Haskell has one of the _clearest_ distinctions between red/blue functions that I know. Functions that perform I/O (sans unsafePerformIO and other strange things) are very explicit about it (by returning an `IO` monad).
Most other languages (static or not) are equally explicit, returning a `Promise<A>` rather than an `IO a`, but it's very similar. Most differences come from call-by-need vs. call-by-value language semantics (Haskell being non-strict and the I/O not coming out of "thin air" but coming from somewhere), but that's kind of a different point than function color.
While the "What Color Is Your Function?" essay highlights the author's pet peeve, asynchronous functions, I always understood it to be less about the underlying implementation and more about syntax and semantics; the whole point is that control flow ends up infecting function-level semantics. That problem extends to anything else a language can treat as "coloured".
For example, in Haskell, side-effectful operations end up being "infected" with the IO monad. This means you're not free to mix and match functions — the moment you need to call some IO function, all callers up the stack need to be monadified, too. This might be a late change — suddenly you need a logger or a random number generator, and it has to be passed all the way up from the outermost point that uses monads. In practice, monads are so deeply ingrained in Haskell now that most devs probably don't see this as a colour problem.
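To make the "infection" concrete, here is a minimal sketch. All names are made up for illustration. Adding a log line to a leaf function changes its type, and the caller's type has to follow:

```haskell
-- Hypothetical names throughout; none of these come from the discussion.

-- A pure helper and a pure caller: no IO anywhere in the types.
withFee :: Int -> Int
withFee amount = amount + 2

totalPure :: [Int] -> Int
totalPure = sum . map withFee

-- Add logging to the helper and its type must change to return IO ...
withFeeLogged :: Int -> IO Int
withFeeLogged amount = do
  putStrLn ("adding fee to " ++ show amount)
  pure (amount + 2)

-- ... and so must every caller up the stack.
totalLogged :: [Int] -> IO Int
totalLogged amounts = sum <$> mapM withFeeLogged amounts
```

Note that `map` had to become `mapM` as well, which is the kind of late, mechanical rewrite described above.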
Multi-value-returning functions in Go are another example of colour. The only way to use the return values of a Go function that returns a tuple is to assign them:
    value, err := saveFile()
    if err != nil { ... }
This means functions like these aren't composable. I can't do saveFile().then(success).fail(exit) or whatever, like you can in Rust. The moment you have a function returning more than one value, your only option is to create a variable. It's weird.
Interestingly, you can do this, but I've never done it and never seen it in the wild:
Well, that's true. I do think there's some value to the fact that if you call a JavaScript function you are guaranteed that no I/O has been executed (provided no one is using the old sync APIs) before the function returned. I lean on this property often, and I don't know any other language besides Haskell that has this property.
I do agree that Haskell is even stronger because it doesn't just guarantee the I/O has not been executed yet, it also reveals the intent of I/O through the use of the IO monad.
Actually, I think I/O is really clear in Haskell. Haskell is a pure functional language, and in that world side-effects are not possible (since a function should always return the same value, given the same input). For this reason functional languages like Haskell introduce monads, which allow you to model imperative languages (with state and order). I/O is performed in one such monad (the IO monad), and programming in the IO monad just looks like imperative programming. The tricky part is that you cannot get pure values out of the IO monad (since it is impure), which may entice programmers to let the IO monad 'leak' into programs too much, rather than lifting pure functions into the monad.
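A small sketch of the "leak" versus "lift" distinction, with hypothetical names. The input action is passed in as a parameter so the example does not depend on stdin:

```haskell
import Data.Char (toUpper)

-- Pure logic: testable without any IO.
shout :: String -> String
shout s = map toUpper s ++ "!"

-- Leaky style: the transformation is written inline inside the IO code.
shoutLeaky :: IO String -> IO String
shoutLeaky readInput = do
  s <- readInput
  pure (map toUpper s ++ "!")

-- Lifted style: IO only supplies the value; fmap applies the pure function.
shoutLifted :: IO String -> IO String
shoutLifted readInput = fmap shout readInput
```

In a real program you would call something like `shoutLifted getLine`. Both versions behave the same; the difference is only how much logic ends up living inside IO.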
Yes, monads can be hard to get in the beginning, but they never felt like tricking the compiler to me.
If you compare Haskell with JavaScript, all JavaScript functions would run in the `IO` monad because you can always (even implicitly as in the article) run an impure action even if everything else seems pure. In Haskell, if your function is not monadic, you can't "run" the monadic action. On the other hand, you don't see exceptions in Haskell's type signatures and you can throw them in pure code. Hm! ;-)
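The exception point is easy to demonstrate. A sketch with hypothetical function names; `try` and `evaluate` are from `Control.Exception` in base:

```haskell
import Control.Exception (ArithException (..), evaluate, try)

-- The type promises an Int and says nothing about failure,
-- yet `div` throws DivideByZero when the divisor is 0.
ratio :: Int -> Int -> Int
ratio x y = x `div` y

-- The exception can only be caught from IO.
safeRatio :: Int -> Int -> IO (Either ArithException Int)
safeRatio x y = try (evaluate (ratio x y))
```

So throwing is invisible in the types, while catching is coloured by `IO`.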
The way I think of it is that an IO value is a program where all the Haskell code runs in callbacks. This is a bit like a JavaScript program where everything runs in a callback. The major difference is that callbacks written in Haskell are constrained to be pure functions.
The IO type is rather rigid in that there is always a next callback, which implicitly contains the entire program state as part of the function closure. In a reactive programming language like Elm you can have many functions that may be the next callback depending on what the next event is, along with all the callbacks that run as part of the signal graph. Purity is about how constrained the callbacks are, not about the overall structure of the program.
I used to write a lot of Haskell and tbh I find the idea of restricting IO to the edges of an application to be equally important in traditional imperative programming languages. It's just good design, not a PITA.
It seems to be a common viewpoint that Haskell makes IO difficult. I actually personally found that it was more powerful because I was being more explicit about it. It also allowed for higher-order functions to abstract common patterns in IO. The only difficulty in Haskell was not due to monads or type safety, but rather in getting my brain to understand lazy evaluation by default.
I'm not necessarily disagreeing with you, but is there an example for a function that doesn't make sense to call from both an async and non-async function, other than an implementation detail of the programming language?
But continuing on your idea, one colouring I would really like to see everywhere is pureness. C++'s `const` comes close, and perhaps Haskell does it right by having everything be (mostly) pure by default and using the type system to encode other properties. And yeah, different monads (like IO) are likely a good way to handle this whole topic in a user-definable way, but not many people like them (at least when they know they are using them, because many programmers use them unknowingly, e.g. most async syntactic sugar is basically do-notation for a single hard-coded monad).
Quite the opposite. In Haskell I/O is an explicit effect. In impure languages I/O is something that happens as a side effect of just calling a function.
Note: The approach of structuring the interactions with the IO type with the functions (bindIO :: IO a -> (a -> IO b) -> IO b) and (returnIO :: a -> IO a) is still using the abstract idea of monads to organize the impure code and make it ergonomic to work with, so "monadic I/O" or "monadic state" aren't entirely misnomers. The thing I wanted to emphasize is that you don't need to know the word "monad" or understand anything in particular about the design process for the Monad typeclass in order to use these libraries.
I think focusing on the "monad" part over the "IO" part of "monadic IO" is particularly confusing to new users because the abstract idea of a monad is very general, so if you assume all places where it shows up are basically like the case of IO, you will be very confused. Further, it makes the idea of a monad seem like a Haskell-specific hack, rather than a general abstraction that can be used in any programming language you want to.
This is particularly important to emphasize because the abstract idea of monads only makes the IO approach to impurity nice to use, it doesn't make it possible. Haskell had I/O (and other impure capabilities) before the monadic way of organizing impure code was introduced. The heavy lifting for IO is done by having a type system strong enough to prevent a function of type IO a -> a from being written by an end-user. If you have written a monad abstraction in a language without such a type system[0], it can still be a nice abstraction, but it doesn't guarantee that pure and impure computations can be distinguished on the type level.
The win for Haskell is that it wears its statefulness on its sleeve. Of course you can, with sufficient discipline, write pure code in any language, though it can be frustratingly difficult in languages where libraries make idiomatic use of mutation. And yes, you can also write impure code in Haskell, at least for some definitions of "impure".
What you can't do in Haskell is write impure code that claims to be pure (up to the customarily and idiomatically avoided `unsafePerformIO`), or write code whose degree of statefulness is ambiguous. Reliable, explicit, and statically enforced purity holds more than just theoretical benefits. I don't have to read through library code to see if it is thread-safe. I don't have to worry about whether passing a data structure to a library function will result in that structure being mutated behind my back. I don't have to trust code comments that may not be in sync with the current state of the code. Simply by virtue of the fact that a function does not mention `IO` in its type, I can have total confidence that it won't violate my assumptions about its behavior. And I don't even have to take it on faith that the author wrote the correct type for his code; if he hadn't, the compiler would have rejected it. This is the difference between "pure by convention" and "provably pure".
Every language must necessarily have an "IO monad" and interact with the inherently stateful world, or else it is useless. Haskell is different not because you can choose to avoid I/O, but because when you do so, the type system will back you up with perfect accuracy.
a and b don't actually matter, because what matters is how the function is called. Given some `f :: Monad m => m a -> m b`, you can only call f with IO as the monad from an already impure context. For example, you can't call f with IO from inside a function (g :: a -> b).
The purpose of IO in Haskell is to explicitly mark side effects, because they cannot be arbitrarily composed in the way pure functions can. IO represents a one-way boundary in that you can turn some pure computation into an impure one (a -> IO a), but there is no way of "extracting" that computation back from IO (i.e., there exists no function of type IO a -> a). That "monads" are used to do this is useful because they provide the (a -> IO a), and happen to have a convenient function for chaining computations (IO a -> (a -> IO b) -> IO b).
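A tiny sketch of that one-way boundary, using only Prelude functions (the value names are invented):

```haskell
-- Entering IO is always possible: here `pure` has type a -> IO a.
boxed :: IO Int
boxed = pure 41

-- Chaining stays inside IO: (>>=) :: IO a -> (a -> IO b) -> IO b.
incremented :: IO Int
incremented = boxed >>= \n -> pure (n + 1)

-- The way back out cannot be written as a total function without
-- resorting to unsafePerformIO, so this stays commented out:
-- escape :: IO a -> a
-- escape action = ...
```

Everything that depends on the result of `boxed` ends up with an `IO` in its type, which is exactly the marking described above.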
How IO is defined is up to the implementation, and not in the scope of the language - different implementations could use a different representation for IO - what matters is that it must be defined in a way that one cannot define an (IO a -> a).
On "a Haskell program is a giant expression that reduces to an object of type IO", this is really nothing to do with Haskell, but a consequence of how we've built our operating systems and how the de facto meaning of "program" these days is equivalent to an "executable file". Traditionally "program" was much more abstract and could refer simply to any piece of code, such as a (pure) Haskell function. We can consider any Haskell function to be a "program" in itself. If we had an environment from which we launched processes which provided all the command line switches and environment variables as arguments, we could easily omit the "IO" from our "main", if the rest of our code was pure. (On a side note, this is precisely what early versions of Haskell did, before monadic IO became practical)
The same way you'd write "code that doesn't cause side effects" in Haskell. You cause side effects. The thing is with promises (like IO) the side effects are limited to a box you can open, process and close.
So given the same input the same function will always return the same output. If you have a function that asynchronously computes a random number - it won't return different numbers and will always return a promise for a random number. It will never be observable that being called with a different input it produced a different output (as long as you don't mutate global state or anything like that).
In Haskell, IO is implemented in a way that causes side effects at the platform level. If there are no side effects there is nothing going on and the program is a no op after all :) Monads like IO are about _limiting_ side effects and _controlling_ state.
In Haskell, the lack of side effects is ensured by the fact that IO is provided by the platform. There is no fundamental difference between Haskell's `getLine` and the DOM's `fetch`; both return a boxed value.
> As soon as you have side-effects (an IO is a side-effect) you can throw your assumptions about "strict conformation to definition" out of the window
You misunderstand what an IO action is. An IO action in Haskell is a value which describes a side effect. For example, an IO Int is a value that describes a side effect which, at runtime, will produce a value of type Int inside the IO monad.
The difference is that the IO Int is a value, it’s a constant which describes how to perform something at runtime (also called promises in some languages). It’s like a callback function, which takes a value as an argument that will be available when it’s called (at runtime).
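Here is a sketch of "an IO action is just a value". The names are hypothetical, and `Data.IORef` is used only so the effect can be observed:

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Building this list performs nothing; it is just a list of descriptions.
bumps :: IORef Int -> [IO ()]
bumps counter = replicate 3 (modifyIORef' counter (+ 1))

-- Effects happen only when the actions are sequenced into the running program.
demo :: IO (Int, Int)
demo = do
  counter <- newIORef 0
  let actions = bumps counter   -- nothing has run yet
  before <- readIORef counter
  sequence_ actions             -- now the three increments run
  after <- readIORef counter
  pure (before, after)
```

Running `demo` yields `(0, 3)`: the counter is untouched until the list of actions is actually executed.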
Sorry for the additional pedantry, but I think this is important to be precise about given the target audience of your comment.
Monads aren't the separation between purely functional and stateful code. The Haskell type system maintains that separation. Anything that doesn't return IO a for some a appears to be a pure function from the perspective of the programmer. Once a function returns IO a, there aren't any* functions provided by the compiler that can make a function that uses those results not also return IO b for some b. For example, the type of getLine is IO String (because it impurely produces a String) and the type of putStr is String -> IO () (because it takes a String and mutates the world without returning anything).
If the compiler provided a function for computing on the a in the IO a, for instance, bindIO :: IO a -> (a -> IO b) -> IO b and a function to wrap the results of non-IO functions, such as returnIO :: a -> IO a, you could do arbitrary computation with these IO-wrapped data types, but know at a glance if your functions were impure.
This approach doesn't require the Monad typeclass at all, just a magic type called IO that tags impure computations that are implemented with compiler and runtime magic. It happens to be the case that this is exactly how GHC implements the IO type. bindIO is implemented here[0] and returnIO is implemented here[1] and the compiler magic used to implement them isn't* exported, so all IO operations have to go through those functions. It is not a coincidence that these functions have the right types to form a Monad instance for IO and indeed, that is also present[2], but the IO type and the type system that ensures it can't be sneakily hidden are doing the heavy lifting, and the Monad instance (and accompanying syntactic sugar) are just there to make it nicer to work with and easier to abstract over.
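As a rough illustration, you can write IO code with those two functions directly and never touch `>>=` or do-notation. This sketch assumes a GHC whose `GHC.Base` module exports `bindIO` and `returnIO`, which is a GHC-specific detail and not portable Haskell. `echoLength` is a made-up name:

```haskell
import GHC.Base (bindIO, returnIO)

-- Print a string obtained from some IO action and return its length,
-- chaining the steps with bindIO instead of the Monad class methods.
echoLength :: IO String -> IO Int
echoLength readInput =
  readInput `bindIO` \s ->
  putStrLn s `bindIO` \_ ->
  returnIO (length s)
```

The result type is still `IO Int`, so the impurity stays visible without any mention of `Monad`.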
If you have a passing familiarity with Haskell, the phrase "state monad" is the obvious place where my claims stop making sense. In fact, the State type only supports computations that are entirely pure. If you want to simulate global variables in a language that didn't have them, you could always pass all of your global variables to every function and get updated ones back from the function along with the nominal results of the computation. The State type is just a regular data type that wraps stateful functions constructed by such state passing. A type of the form State Int String is just a function that takes an Int and returns a String and an Int, no compiler or runtime magic needed.
You can play the same trick as in the IO case and provide functions bindState :: State s a -> (a -> State s b) -> State s b and returnState :: a -> State s a in order to compute on these "stateful" values while making sure the result state got passed to the next function in the chain correctly. Like IO, these two functions can be used to create a Monad instance for State. Unlike IO, State is just a data type holding a regular Haskell function, so it's extremely reasonable to write a function of type State s a -> s -> a which runs the State s a computation with an initial value of type s. This is written by unwrapping the State type, passing the initial state value to the function inside, and returning the result while ignoring the returned new state. More details on how State is implemented are available here[3].
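A minimal hand-rolled version of what is described above. This is a sketch, not the library's actual definition; `evalState'`, `fresh` and `twoLabels` are names invented for the example:

```haskell
-- A State s a is just a wrapped function from a state to a result and a new state.
newtype State s a = State (s -> (a, s))

returnState :: a -> State s a
returnState x = State (\s -> (x, s))

-- Run the first computation, then feed its result and its new state to the next.
bindState :: State s a -> (a -> State s b) -> State s b
bindState (State run) next = State $ \s ->
  let (x, s') = run s
      State run' = next x
  in run' s'

-- The function of type State s a -> s -> a: run and drop the final state.
evalState' :: State s a -> s -> a
evalState' (State run) s0 = fst (run s0)

-- Example: hand out the current counter as a label, then increment it.
fresh :: State Int String
fresh = State (\n -> ("x" ++ show n, n + 1))

twoLabels :: State Int (String, String)
twoLabels =
  fresh `bindState` \a ->
  fresh `bindState` \b ->
  returnState (a, b)
```

`evalState' twoLabels 0` evaluates to `("x0","x1")`, with no IO and no mutation anywhere.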
A complication to this is that if you want stateful mutation for performance reasons, the ST type[4] also exists, which looks identical to the State type from the programmer's perspective, but plays similar tricks to IO in order to actually mutate under the hood while not exposing the implementation details to the user, so it can be reasoned about exactly as if it was pure and using the same implementation as State.
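For comparison, a small ST sketch using `Control.Monad.ST` and `Data.STRef` from base. Here `sumST` is a hypothetical example function, and the STRef API is one common way to use ST, not the only one:

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Mutates a real reference internally, yet the signature is pure:
-- runST ensures the reference cannot escape or be observed from outside.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0
  mapM_ (\x -> modifySTRef' acc (+ x)) xs
  readSTRef acc
```

Callers see an ordinary `[Int] -> Int` function and can reason about it as such.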
These Monad instances for IO, State, and ST start to pull their weight when you write functions that only use features provided by the Monad typeclass and they work seamlessly with any implementation of stateful computation despite their very different internals. Monad is quite general, so if all you care about is abstracting over stateful computations, you can also use the methods from MonadState[5] which allow you to interact with the state along with the results of the computation independent of the implementation of stateful computation.
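A sketch of a function written only against the Monad interface. `countM` is a made-up helper, and `Identity` (from base) stands in for "no effects at all" to keep the example dependency-free:

```haskell
import Data.Functor.Identity (runIdentity)

-- Uses only the Monad interface, so it works for any monad m.
countM :: Monad m => (a -> m Bool) -> [a] -> m Int
countM _ [] = pure 0
countM p (x : xs) = do
  hit <- p x
  rest <- countM p xs
  pure (if hit then rest + 1 else rest)

-- Pure instantiation via Identity.
countEvens :: [Int] -> Int
countEvens = runIdentity . countM (pure . even)

-- IO instantiation: the same function, now with a side effect per element.
countEvensLoud :: [Int] -> IO Int
countEvensLoud = countM (\x -> print x >> pure (even x))
```

The same `countM` would also work inside State or ST computations, since nothing in it depends on how the monad is implemented.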
* In the name of not getting bogged down in details, there are a few parts of this discussion that are not entirely accurate, particularly around functions like unsafePerformIO[6].
Being able to tell whether something does IO or not is quite important, and the ability to encode that in types is what Haskell has and other languages don't.
Secondly, IO is just one monad. You can build a more granular one where you can separate filesystem, network etc, and encode this into types (so you know at a glance). You can't do this in most other languages.
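One way this is sometimes done is with a typeclass per capability. The following is only a sketch with invented names (`MonadFileRead`, `Fake`, `lineCount`); real effect libraries are more elaborate:

```haskell
-- Hypothetical capability class: "may read text files".
class Monad m => MonadFileRead m where
  readTextFile :: FilePath -> m String

-- Production interpretation: real filesystem access.
instance MonadFileRead IO where
  readTextFile = readFile

-- Test interpretation: an in-memory "filesystem", no IO at all.
newtype Fake a = Fake { runFake :: [(FilePath, String)] -> a }

instance Functor Fake where
  fmap f (Fake g) = Fake (f . g)

instance Applicative Fake where
  pure x = Fake (const x)
  Fake f <*> Fake g = Fake (\env -> f env (g env))

instance Monad Fake where
  Fake g >>= k = Fake (\env -> runFake (k (g env)) env)

-- Missing files read as empty in this toy interpretation.
instance MonadFileRead Fake where
  readTextFile path = Fake (maybe "" id . lookup path)

-- Polymorphic in m, so the only effect available here is reading files.
lineCount :: MonadFileRead m => FilePath -> m Int
lineCount path = length . lines <$> readTextFile path
```

Because `lineCount` is polymorphic in `m`, it cannot open sockets or print to the terminal, and the constraint in its signature tells you that at a glance.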
What are the types of things that are worth being hung up on for a language for you?
I'm a little disappointed that the article suggests that the IO monad makes Haskell impure. While it is certainly true that unsafePerformIO violates purity (and type safety, as well), without it, IO is perfectly pure. It's just that if an IO a value happens to end up in main or something referenced from main, input and output happen. From the language's perspective, the IO monad functions could all be perfectly pure and have no relation to input or output (putStrLn could be a no-op, input functions could always return the same thing). It would be utterly useless for its intended purpose, but there would be no difference in terms of its purity.