Hacker Read

Hacker Read top | best | new | newcomments | leaders | about | bookmarklet

login

rbehrends | karma 2956 | avg karma 3.32 2017-02-20 18:11:52 | [–] update item

> Both C++ and rust encourage exclusive ownership. And that's a good thing.

It's not. It makes lots of important things unnecessarily hard and often leads to unnecessary copying [1]. It makes it hard to even do something like OCaml's List.filter and Array.filter properly. It gets in the way of doing functional data structures (ex: binary decision diagrams). Lots of common design patterns also require shared ownership.

> HOF tie resources to lexical scope; RAII ties them to object lifetimes, recursively. In turn lifetimes may be bound to a lexical scope, but often is not the case.

No, higher order functions don't per se tie resource management to lexical scope (though that's the easiest application). You can build an entire transactional model on top of higher order functions.

Second, tying resource management to object lifetime is dangerous, as object lifetime can exceed the intended life of a resource (ex: closures, storing debugging information on the heap).

Third, RAII is in practice little more powerful than lexical scoping. RAII works poorly for global variables (problems with initialization/destruction order) and thus is in practice limited to automatic and heap storage. Using heap storage leads to the aforementioned problems, where object lifetime can become unpredictable.

[1] https://news.ycombinator.com/item?id=8704318

view as:

gpderetta | karma 12081 | avg karma 1.83 2017-02-21 00:38:59+00:00 | [–] similar comments

>It makes lots of important things unnecessarily hard and often leads to unnecessary copying [1]

If you read the article referenced by that thread you'll see that a major issue with string copying is due to having a lot of bad interfaces taking raw C pointers so code at both side of the ibterface need to make copies exactly because the raw C pointer doesn't guarantee exclusive ownership. The remaining issues are due to a badly optimized string buider in Chrome and failure of pre-reserving vector memory which lead to many copies on resize. This last issue is fixed with move semantics in c++11.

>It makes it hard to even do something like OCaml's List.filter and Array.filter

Std::remove_if works genrically on any range. Boost (and the range TR) provides iterator views when you need lazy evaluation.

> You can build an entire transactional model on top of higher order functions.

I'm sure you can, in the end you can implement anything manually. With RAII propagation of lifetimes is done automatically by the compiler.

>Using heap storage leads to the aforementioned problems, where object lifetime can become unpredictable.

Only if you use shared ownership. Otherwise is completely predictable.

rbehrends | karma 2956 | avg karma 3.32 2017-02-20 19:55:48 | [–] similar comments

> If you read the article referenced by that thread you'll see that a major issue with string copying is due to having a lot of bad interfaces taking raw C pointers so code at both side of the ibterface need to make copies exactly because the raw C pointer doesn't guarantee exclusive ownership. The remaining issues are due to a badly optimized string buider in Chrome and failure of pre-reserving vector memory which lead to many copies on resize. This last issue is fixed with move semantics in c++11.

The point is that something like:

  List.filter (fun s -> String.length s > 0) list

simply cannot be done efficiently, because you require either copying or shared ownership for the strings. This also occurs naturally in a number of other situations, such as storing strings in objects.

> Std::remove_if works genrically on any range. Boost (and the range TR) provides iterator views when you need lazy evaluation.

std::remove_if is destructive. Iterators are not the same thing as a functional filter operation.

> I'm sure you can, in the end you can implement anything manually.

The point here is that a transactional system is more powerful than RAII.

> Only if you use shared ownership. Otherwise is completely predictable.

If you don't use shared ownership, then you're basically limited to lexical scoping.

gpderetta | karma 12081 | avg karma 1.83 2017-02-21 09:53:19+00:00 | [–] similar comments

> The point here is that a transactional system is more powerful than RAII

my point was that RAII trivially maps to transactions (destructors do rollback and commit is explicit). Transaction RAII object can be composed to make more complex transactions.

> If you don't use shared ownership, then you're basically limited to lexical scoping.

why would you say that? A common use case is having objects manually removed from collections triggering cleanup actions, like closing sockets, automatically de-registering from event notifications, sending shutdown events, rolling back transactions. That has nothing to do with lexical scoping.

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 12:59:31+00:00 | [–] similar comments

> my point was that RAII trivially maps to transactions (destructors do rollback and commit is explicit). Transaction RAII object can be composed to make more complex transactions.

RAII is limited in that it's tied to object lifetime, whereas more general transactional semantics can be linked to more general semantic conditions. Also, RAII does not have a good way to distinguish between commits and aborts.

> why would you say that? A common use case is having objects manually removed from collections triggering cleanup actions, like closing sockets, automatically de-registering from event notifications, sending shutdown events, rolling back transactions. That has nothing to do with lexical scoping.

If you use an explicit action to trigger destruction, then this doesn't have to be a deletion. In fact, having it tied to object destruction is unnecessarily limiting. The general rationale for having RAII is that it happens automagically; if explicit disposal is needed, then much of that rationale goes away.

In general, it seems to be me that you don't have actual experience with resource management outside of C++, so you're mostly speculating about what it's like and try to force your thinking about it into a C++-like model.

gpderetta | karma 12081 | avg karma 1.83 2017-02-21 07:53:15 | [–] similar comments

> in general, it seems to be me that you don't have actual experience with resource management outside of C++, so you're mostly speculating about what it's like and try to force your thinking about it into a C++-like model.

yes, I'm a C++ programmer. I have experience with resource management in C# and python for example, which are a pale shadow of what is possible in C++.

I know nothing of resource management in functional languages, especially regarding transactions and I would love to read more about it if you have some pointers (ah!)

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 15:26:18 | [–] similar comments

There's an interesting example of resource management in Haskell with monads [1], but it's probably not easy to follow if you aren't already steeped in Haskell lore, so let me be a bit more basic.

First, note also that it would not be particularly hard to add RAII on top of a garbage-collected language to coexist with GC for resource management; it's just not done in practice. And it's not because language designers are ignorant of it (Bjarne Stroustrop's Design and Evolution of C++ is part of the standard recommended reading list in the field).

Generally, you want resource usage to be a provable property of a program. Not that you'd actually write a formal proof, but you generally want to be able to explain at least informally why resource usage follows certain constraints (e.g. having certain upper bounds).

The basic insight that you need is that resource lifetime is just another semantic property that you can handle with basically the same techniques as other properties of programs; you do not need special language support for it (though, obviously, it helps if your language is a bit more expressive than a Turing machine :) ).

This means that you'll generally tie resource usage to program state and program behavior that you can reason about. The incidental semantics of object lifetime can be dangerous, especially in a functional language, as object lifetime can sometimes be unpredictable.

One of the major hiccups are closures. Closures capture their environment (including local variables) and if they survive the stack frame that created them (because they are returned or stored on the heap), then the lifetime of any captured object can be extended in a fairly unpredictable fashion. Obviously, that is not a good thing, as you have a hard time proving lifetime properties, but few functional programmers would limit themselves to a more trivialized use of closures just for the sake of RAII.

Instead, as I said, you tie resource management to program behavior or state. In the most simple case, that can be scoped resource management. But it can also be an LRU cache, a system based on transactions, or something else entirely. Here's a simple example of a library I sometimes use in OCaml:

  class example = object
    inherit Tx.resource
    initializer print_endline "create"
    method release = print_endline "close"
  end

  let _ = Tx.scoped (fun () -> new example)

This is a simple lexically scoped transaction, but the library also allows for chained, nested, etc. transactions that don't begin and end based on lexical scope, but (say) program events (e.g. terminate a transaction when a socket is closed from the outside and release resources that are associated with that connection). It can also distinguish between commit and abort behavior (similar to the Haskell example above), will properly error if resource creation is not done within the context of a transaction, plus a few other bells and whistles.

[1] https://aherrmann.github.io/programming/2016/01/04/resource-...

sqeaky | karma 2830 | avg karma 1.71 2017-02-21 13:03:23 | [–] similar comments

15 years as a C++ dev and I agree with gpderetta, cleaning all manners of resources is awesome. Even Bjarne, the creator of C++, agrees cleaning up resources in destructors is good. Then the standard committee agrees it is good, because things like std::lock_guard and the custom "deleters" on shared_ptr and unique_ptr exist and work with many resources.

If you have issue managing lifetimes I can see why you might think explicit resource cleanup is better, but with so many scopes and even thread local scope, and move semantics ability to move an object into new scopes, there really is no limitation impose by tying resource cleanup to object lifetime. If you don't like that, then make you own classes to do it explicitly.

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 14:52:39 | [–] similar comments

And as I said before, if your entire perspective comes from C++, it may be too narrow. I'll give you two examples:

1. Modern functional programming languages generally come with compacting, generational garbage collectors that have bump allocators. This means in particular that the cost of heap allocations for temporaries is only marginally higher than that of alloca() and has good locality even for pointered structures (to the point where linked lists can outperform dynamically resized arrays such as std::vector, which is basically unheard of in C++). When heap and stack allocations are that competitive, that opens up a whole new set of techniques that aren't normally used in C++ and lifetime considerations become a lot more complex.

2. Functional programming languages use closures extensively, and closures can have effects on object lifetimes that are difficult to predict. The reason is that closures capture their environment – in particular local variables – and if they survive the stack frame that generated them, this can lead to objects living much longer than you think. It's a major reason why closures and RAII don't get along well (note that C++ didn't have closures until recently and in practice their use is much more constrained than in functional or multi-paradigm languages).

This does not mean that you do not want to have sane resource handling. But in general, you want resource usage to be a provable property of a program, so you will generally tie resource management to program state or program behavior rather than incidental language semantics.

gpderetta | karma 12081 | avg karma 1.83 2017-02-22 11:24:01+00:00 | [–] similar comments

I know that allocation under a good GC is almost free, but even when completely off L1$, pointer chasing kills ILP.

rbehrends | karma 2956 | avg karma 3.32 2017-02-23 07:38:16 | [–] similar comments

Careful, that's not what I'm claiming.

First, allocation is only so cheap if it's temporary. If objects survive minor collections, then there's additional cost, as they get promoted to the major heap. The key idea that I'm getting at is that with temporary objects being cheap, you have more flexibility in creating temporary data structures and do not have to fit them in the constraint of a stack frame and (unlike with alloca()) do not have to worry about stack overflow and they can be returned from a function without copying (unlike stack frame contents).

Temporary data structures will still be small and generally fit in the L1 cache of any reasonably modern processor. And using pointer does not mean that everything is a pointer, and that you're necessarily sacrificing ILP.

gpderetta | karma 12081 | avg karma 1.83 2017-02-24 03:47:02 | [–] similar comments

Yes I was talking about short lived allocations.

I don't discount the power of being able to cheaply create (short lived) highly dynamic data structures. I do miss it in C++ and alloca never feels right.

charlieflowers | karma 2198 | avg karma 2.46 2017-02-22 08:39:26 | [–] similar comments

> Modern functional programming languages generally come with compacting, generational garbage collectors that have bump allocators. This means in particular that the cost of heap allocations for temporaries is only marginally higher than that of alloca() and has good locality even for pointered structures

Interesting. Would you mind naming a few such languages? I'm guessing Haskell. What about OCAML? Any others?

rbehrends | karma 2956 | avg karma 3.32 2017-02-23 07:41:22 | [–] similar comments

I know that OCaml, Haskell, the JVM and Microsoft .NET do it (I think Mono does, too, but am not positive). And I know for a fact that OCaml and the JVM inline allocations and optimize multiple allocations that are close together (e.g. increasing the allocation pointer only once even if you allocate a pair of objects).

It's fairly common and needed for modern functional languages, as they can go through a lot of temporary objects when programming in a purely functional style.

charlieflowers | karma 2198 | avg karma 2.46 2017-02-23 00:52:22+00:00 | [–] similar comments

>> The point is that something like: List.filter (fun s -> String.length s > 0) list simply cannot be done efficiently, because you require either copying or shared ownership for the strings. This also occurs naturally in a number of other situations, such as storing strings in objects.

This is a very salient point. It bounced around in my brain a couple of hours before I came back to comment.

How often do you need to control memory layout and management so you get the absolute best performance? Compare that to how often you need to express filters.

For me, there's no doubt that expressing functional logic and having it be decently efficient is the most important need.

This little example of yours illustrates that a well-designed garbage-collected language has a HUGE advantage over RAII. I may be slow on the uptake, but this is the first time I've seen it that way.

gpderetta | karma 12081 | avg karma 1.83 2017-02-23 07:43:38 | [–] similar comments

Filters are used all the time in C++.

* If the filtered result is used locally in a function (and then thrown away), a filtered view works just fine, is very cheap, efficient and is lazy (nice if you only consume a subset of it).

* Often only the filtered view is used, so you can destructively modify the original list, so no copies.

* If you need both the original list and the the filtered list, in a functional language you need to allocate new cons cells anyway. The cost of allocating, modifying and touching the new memory is going to dominate except for very long strings, so copying is not an issue.

Of course in C++ lists are frowned upon in the first place (as many algorithms can handle any data structure transparently), while they are kind of central in many functional languages.

There are cases of course where frictionless shared ownership is nice and GC shines. Filter is not one of them.

rbehrends | karma 2956 | avg karma 3.32 2017-02-24 15:04:13+00:00 | [–] similar comments

> Filters are used all the time in C++.

The problem is that you need a vastly more complex machinery to cover all the various ways to avoid copying/reference counting and then still don't have a general solution when you can't avoid multiple ownership (unless you count std::shared_ptr with its very high overhead).

As I said before, it's not that you can't do it, it's that there are costs associated with it.

> If you need both the original list and the the filtered list, in a functional language you need to allocate new cons cells anyway.

You can filter arrays also and allocating cons cells for a list is pretty cheap with a modern GC, as discussed before.

steveklabnik | karma 91260 | avg karma 5.08 2017-02-21 05:11:10+00:00 | [–] similar comments

> often leads to unnecessary copying [1].

Rust is very different from C++ here. You wouldn't get those copies unless you directly called .clone(), so you'd know you were doing a deep copy.

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 07:24:14+00:00 | [–] similar comments

That's not the point that I'm getting at. Exclusive ownership invariably mandates copying for certain use cases. Rust allows you to obviate this in a few more cases through borrowing, but as a general rule, exclusive ownership and having multiple references to the same object do not mesh. You need to get rid of either one or the other. That Rust forces you to be explicit about copying in those cases does not make the underlying problem go away.

This is especially noticeable and constraining when you come from a functional programming background and not C++ or when you're doing stuff that's more complicated than shuffling bytes (I've mentioned binary decision diagrams as an example).

The bigger point is: as it is extremely rare for me to write one of the few niche applications that are actively GC-hostile (such as web browsers, AAA video games, or OS kernels), I don't see the point of jumping through all the extra hoops that avoiding GC brings with it.

steveklabnik | karma 91260 | avg karma 5.08 2017-02-21 17:51:28+00:00 | [–] similar comments

> Exclusive ownership invariably mandates copying for certain use cases.

Hm, why would you not use multiple ownership then, instead?

sqeaky | karma 2830 | avg karma 1.71 2017-02-21 13:04:34 | [–] similar comments

I think that is his point, he thinks Rust and C++ cannot do this. But is the default in C++ and well done in Rust.

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 14:57:27 | [–] similar comments

No, my point is not that it cannot be done (both Rust and C++ are Turing-complete, so "cannot be done" does not make sense for anything that's a computable property), but that it comes with a cost. See my other response for the details.

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 14:56:26 | [–] similar comments

> Hm, why would you not use multiple ownership then, instead?

That's exactly what I want. The problem is that (1) it comes with significant runtime overhead and/or syntactic noise in Rust/C++ (or alternatively, lack of memory safety); and (2), it becomes difficult to write code that works equally well for multiple and exclusive ownership (module APIs often become burdened with implicit or explicit ownership assumptions).

steveklabnik | karma 91260 | avg karma 5.08 2017-02-21 15:13:21 | [–] similar comments

Gotcha. I'm not sure I agree, but I understand :)

rbehrends | karma 2956 | avg karma 3.32 2017-02-21 21:32:09+00:00 | [–] similar comments

Well, let me also add that I completely understand that there are use cases for Rust, where a GCed language would be a poor fit. An obvious example is a web browser ( :) ), where the x% memory overhead that comes with a garbage collector is just a price that you may not be able to afford to pay.

In other words, don't read this as "Rust sucks" (I actually rather admire Rust's design), read it as: for my purposes, the practical use cases of Rust are generally too niche to justify the software engineering tradeoffs.

steveklabnik | karma 91260 | avg karma 5.08 2017-02-21 15:57:00 | [–] similar comments

Yeah, I hear you. I think this is fair. And at least some of it comes down to preference, that is, I don't think that the syntactic noise is very much, but others can certainly disagree. The others would mostly be a "nuh uh" since I don't have numbers anyway :)

At the end of the day, Rust can never be great for every single last programmer, and this is entirely okay.

charlieflowers | karma 2198 | avg karma 2.46 2017-02-22 14:45:48+00:00 | [–] similar comments

> Third, RAII is in practice little more powerful than lexical scoping.

I'm beginning to think that we're all going about things wrong :) . We like and rely on things like RAII, macros, etc. But these are nothing more than specific compiler features.

If we were to take control of generating our own code, we could have RAII, macros, and whatever else we dreamed up, easily. And the generated code could be in some readable, debuggable language, so much more visible and obvious than the results of a macro expansion.

For so long we've relied on an amazing black box called a "compiler", when perhaps we should take on the responsibility ("power") of implementing a compiler ourselves. (We could still, of course, generate some intermediate mainstream language, and then those amazing black boxes could take that as input and apply all their optimizations.

Seems to yield clearer code, less magic, more power, and more portability.

Legal | privacy