Shows and compares graphs plotting performance vs. code size, based on the programming language benchmarks at http://shootout.alioth.debian.org/. Note that 'dependability' here means 'the consistency of the performance or code size'.
Few solid conclusions can be drawn: small performance benchmarks are hardly indicative of either the code size or the performance of most actual projects in a language.
The references to Ruby here aren't entirely what they seem. They appear to refer to Ruby in the older shootouts rather than the recent ones: that is, the uber-slow Ruby 1.8, rather than the Perl-beating, PHP-matching Ruby 1.9, which corresponds more closely to "yarv" on this list.
It's great to see Ruby so much faster with 1.9, because 1.8's speed was such a stumbling block in the data-intensive work I pushed it towards.
However, don't count your chickens just yet with regard to Ruby 1.9 being "Perl-beating" ;-) Don't you think it's odd that Perl 5.10.0 sits only just above Ruby 1.8.7 in the current Alioth shootout? A quick look at the tests turns up a known 5.10.0 performance bug; fixing it in a quick local test here makes all the difference ;-)
Very good graphs. Aside from X (time) and Y (code size), the center of the star implies "typical" performance, and the relative spread of the star shows specific data points. Ruby, for example, shoots out from the bottom right - to me, this says, "This is quite expressive, and if you know what you're doing, you can get reasonably good performance". It doesn't show memory usage or sample size, however - Rebol probably has significantly fewer samples than Python, for example, but it only looks like half as many.
It's also interesting how the LuaJIT star has about the same shape as the Lua star, but with 1/3 the width (which is consistent with my experience); Psyco relates similarly to Python. Surprisingly, the Lua star's center appears to be at the same X as GHC's, but lower down, and LuaJIT's overlaps OCaml's. While I can accept well-written Lua being approximately as expressive as Haskell (depending on the problem), I don't think it's in quiiiite the same ballpark as OCaml, speed-wise.
The most interesting shape, to me, is Io's: While the default performance seems pretty poor (the right edge meaning 8x as slow as the left edge), there are bands shooting all the way to the left, without adding any real height. I don't have any experience with Io (I'm curious, but have never had success porting it to OpenBSD/amd64); can anybody here comment on this?
There seems to be a mistake. If you compare Io to JRuby [1], you'll find that Io is always the slower one, even though the picture in the article suggests otherwise.
"Ruby, for example, shoots out from the bottom right - to me, this says, ... if you know what you're doing, you can get reasonably good performance" Sorry, but it does not say you can get reasonably good performance, it says this version of Ruby is slow, one of the slowest languages in the Shootout. And the graph itself says nothing about the skill level required to obtain the exhibited performance.
> And the graph itself says nothing about the skill level required to obtain the exhibited performance.
In all fairness, neither do most people promoting C++'s performance-at-any-cost design.
I think a large spread in performance does correlate with the skill level required, though -- it means that some people using the language get good results, but that many also get very bad ones. Ruby is a really slow language ("reasonably good" < great), but it's still good enough for many purposes, provided your algorithms are reasonable.
Ruby is not a really slow language; it has some really slow implementations. Ruby has a high cost of entry for a VM/tool developer because it has a relatively byzantine syntax which has been hard to standardize. You can write code where the same tokens yield entirely different syntactic elements based on something as arbitrary as, say, what day of the week it is. You can do that in Python or Smalltalk too, but in Ruby you can do it by accident, with statements that look very innocent. (This is covered right in the Pragmatic Programmers' Ruby book.)
If dealing with Ruby's syntax were easier, Ruby implementations would progress faster.
That sounds more like, "its design interferes with fast implementation" to me. If there's a major difference between that and "it's a slow language", I'm really curious what it is.
You missed a meta-level. The design doesn't interfere with having a fast implementation. It interferes with implementing a fast implementation.
"It's a slow language," implies that there's something special about Ruby that precludes (relatively) fast implementation. There's nothing special about Ruby like this. Ruby is a very nice remix of language features that existed before. Getting a language to be fast can require many iterations. Ruby's syntax is a high barrier to entry to implementers, so fewer eyeballs have been looking at the problem.
Smalltalk was designed to be quickly implemented by a single programmer. Implementing a "Bluebook VM" was treated as a rite of passage for support engineers at at least one of the vendors. As a result, people have implemented many, many variations on Smalltalk. (Including a VM with a built-in OODB, another that can compile down to a C++-compatible DLL indistinguishable from one written in C++, another that ran on the old Palms, one with optional typing, and many others.)
I think it's fair to say that writing a Ruby VM is probably an order of magnitude harder.
Lua isn't usually associated with the functional languages. Besides mainly being promoted for embedded / extension use, I think the single biggest reason for this is that, while it has tail-call elimination, closures, first class functions, etc., you still have to say "return x" at the end of functions. I do a lot of FP-style stuff in Lua, and while the language doesn't actively resist it the way Python does, it's not the main direction its design flows.
It seems like a small thing syntactically, but languages that return the result of their last expression by default (I think of them as "expression" languages), and that require an explicit block for side-effects (begin in Scheme and OCaml, progn in Common Lisp, etc.), tend to get idiomatically used in a functional manner, while languages that default to a series of statements with side-effects and require an explicit return at the end ("statement" languages) typically do not. While you can use several of the latter languages for functional programming, only the former are functional "by default".
Saying "function(x) return x + 1 end" is a tiny bit more trouble than "fun x -> x + 1" or "(lambda (x) (+ x 1))", so while lambdas still get used where they're the best fit (as arguments for map or filter, for example), they're less common overall. Defaults influence a language's overall style at least as much as what's theoretically possible. Python's concept of what's "pythonic" seems to be the most explicit acknowledgement of this.
Incidentally, I would love to see a language that crossed Lua with the ML family: a small, portable, clean implementation that interfaces easily with C, but with type inference, ML's top-notch module system, and (ideally) Haskell-style type-classes. OCaml is such a good language that I'm willing to tolerate its warts, but I'd strongly prefer a minimal ML-like language. (FWIW, I haven't used SML, though that's just because I do most of my programming on OpenBSD/amd64 and the smlnj port is i386-only.) When standalone, Lua is best suited to small and medium projects; its historical emphasis on embedding in an existing project has made a module system for standalone use a bit of an afterthought. As a project gets larger, static checking and stable module interfaces become far more important, and this is one area where the ML languages really excel.
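As a rough sketch of what the praise for ML's module system is about (the names here are invented for illustration): signatures pin a module to a statically checked interface, and functors build new modules from old ones, all verified at compile time:

    (* A signature: the statically checked contract for a module. *)
    module type COUNTER = sig
      type t
      val zero : t
      val incr : t -> t
      val to_int : t -> int
    end

    (* An implementation; its representation is hidden by the signature. *)
    module Counter : COUNTER = struct
      type t = int
      let zero = 0
      let incr n = n + 1
      let to_int n = n
    end

    (* A functor: a module parameterized by another module. *)
    module Logged (C : COUNTER) : COUNTER = struct
      type t = C.t
      let zero = C.zero
      let incr c = print_endline "incr"; C.incr c
      let to_int = C.to_int
    end

    module LoggedCounter = Logged (Counter)

This is the kind of interface stabilization that pays off at 200 kloc rather than at 20 lines.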
> you still have to say "return (val)" at the end of functions
We have a bunch of code that compiles to both Common Lisp and Javascript, and those nasty "return" statements are one of the two biggest mismatches between the two languages (the other being Javascript's weird semantics around null, false, 0, and ""). One of these days I intend to try inserting them all automatically.
Indeed. It's surprising how deeply eliminating return changes the language semantics.
I've postponed learning Javascript until recently, because my work hasn't usually been web-oriented, but it's interesting how similar Javascript and Lua are. They seem to have been aimed at the same target, but for historical reasons Javascript's development got frozen early, while Lua had time to iron out many similar design flaws.
(In Lua, undefined and null are both nil, nil and false are "false-y", and everything else, including 0 and "", are truthy.)
I think you hit the nail on the head earlier. It isn't "return" as such, it's expressions vs. statements. Backus got it right. What's surprising is that the superior way gets the tiny minority of usage. (Not so surprising if you know the historical reasons.)
JS is not so bad. We're lucky that what we're doing (writing high-level FP code that compiles to JS and runs in pretty much every web browser in the world) is doable at all.
JS even has its own version of progn: "(a,b,c)" evaluates to c (after evaluating a and b). We use that pretty heavily. Alas, it has two problems: there are some things you can't do inside such an expression (like declare new variables or use for loops), and it makes the JS harder to read and debug. Otherwise we'd probably just compile everything that way.
Can you get around (at least some of) those restrictions by wrapping things in function expressions? Here's a for loop:
((function(){ for (var x in [1,2,3]) { print(x); } })(), 42)  // prints the indices "0","1","2", then evaluates to 42
Yes, and we have some macros that do just that (a Greenspunned version of half of CL's LOOP comes to mind). But it's not great as a general solution because of what it makes you give up in performance and readability. Also, it doesn't solve the problem of declaring new variables.
Every now and then we hit upon a new abstraction that lets us remove another chunk of ugliness from our code (while still generating acceptable JS) and I'm hoping that the trend will continue, especially if we can get rid of all those "return"s.
Isn't "dependability" here mostly a stand-in for the language making more assumptions for you? In a language that makes a lot of assumptions, it's easier to create something where assumptions conflict somewhat invisibly.
If you've looked at Peter Norvig's spell checker shootout, you won't be surprised much. E.g., the story on Common Lisp and Scheme implementations is the same here: MzScheme is the mediocre best of an unimpressive lot. (I want to see Clojure and Arc on this.) Python looks great again. F# looks weak here but wins big in the spell checker with reasonable-looking code. Sounds interesting.
What's the deal with Squeak? I had thought that it was very concise.
The meta-message here is best of all: people are measuring language conciseness.
Well, Arc's running on top of MzScheme, so it's likely to be a bit slower. Also, Stalin in the left column is a Scheme implementation, albeit one that is very restrictive for the sake of raw speed.
Also, it's worth noting that the features that make large systems' codebases manageable (such as good module systems) are different from the features that let you pare down small scripts further. It's hard to show features supporting conciseness-in-the-large in a one-page benchmark, so IMHO conciseness-in-the-small is often greatly overemphasized. Being able to knock 2 lines off 20 is often just a parlor trick, while knocking 50 kloc off 200 kloc can mean life or death for a project. The programming world would be a happier place if the size of every legacy C++ / Java codebase were cut by a double-digit percentage.
I don't know much about the Shootout, but my guess is that, if someone wants Arc there, they need to go and write whatever programs the Shootout requires you to write in order to participate in the Shootout.
I'd try it, but I don't know Arc (or Lisp yet, for that matter). If someone here knows Arc and wants to see it in the Shootout, I suggest they look into the faq: http://shootout.alioth.debian.org/u32q/faq.php
Just a guess, but: Squeak only uses text files as an export format, and that format is more verbose than file-based source code would normally be. If you only counted the code and not the metadata, it would probably do better.
For this sort of comparison, mostly-functional multi-paradigm languages like OCaml and Common Lisp (SBCL) should be run through the benchmarks twice: once with imperative code, once in the functional idiom. The often-repeated complaint (of which I'm skeptical) about functional languages is that they're only efficient when ugly imperative features are used, and it would be good to confirm or dispel that suspicion.
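To make the two idioms concrete, here's a minimal OCaml sketch (just an illustration; the actual benchmark programs are far more involved), computing the same sum of squares both ways:

    (* Functional idiom: no mutation; fold over an immutable list. *)
    let sum_squares_functional xs =
      List.fold_left (fun acc x -> acc + x * x) 0 xs

    (* Imperative idiom: a mutable reference and a for loop over an array. *)
    let sum_squares_imperative a =
      let total = ref 0 in
      for i = 0 to Array.length a - 1 do
        total := !total + a.(i) * a.(i)
      done;
      !total

    let () =
      Printf.printf "%d %d\n"
        (sum_squares_functional [1; 2; 3])
        (sum_squares_imperative [| 1; 2; 3 |])

Both styles compile in the same language, so a benchmark really could report them separately.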
It's more complicated than "functional is slower than imperative". Functional data structures are sometimes less immediately efficient, but they buy you persistence: since they're immutable, they share subsections of the structure, and past versions remain accessible as snapshots while you "modify" it. Whether that gives you needed features or just means extra work and cache misses depends on what you're doing. If you have a functional data structure and you want to add "undo" or backtracking, you're already most of the way there, while adding these to an imperative data structure means extra work to keep track of state changes.
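Here's a minimal OCaml sketch of that "most of the way there" point, using plain immutable lists as a stand-in for richer persistent structures: each update returns a new version that shares structure with the old one, so "undo" is just holding on to an earlier version:

    (* Pushing builds a new list sharing its tail with the old one;
       the old version stays intact. *)
    let push item stack = item :: stack

    let () =
      let v0 = [] in
      let v1 = push 1 v0 in
      let v2 = push 2 v1 in        (* v2 shares its tail with v1 *)
      let after_undo = v1 in       (* "undo" = keep the old pointer *)
      Printf.printf "v2: %d items; after undo: %d\n"
        (List.length v2) (List.length after_undo)

With a mutable stack you'd have to record and replay inverse operations to get the same effect.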
_Purely Functional Data Structures_ by Chris Okasaki is, by far, the best collected resource, if you want to read further. (His thesis (http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf) is the kernel of the book.) Also, the fourth chapter of Developing Applications with Objective Caml (English translation: http://caml.inria.fr/pub/docs/oreilly-book/) discusses functional and imperative styles' strengths and weaknesses.
Reducing everything to runtime benchmarks obscures how, in practice, the trade-off is often more like "it was a huge pain in the ass to get working without bugs, but it's 2% faster" versus "it took twenty minutes and was correct on the first try". (Also, whether or not you use it in production, OCaml rules for prototyping complex data structures, in the same way Erlang does concurrency and C++ does linking errors.)
V8 has propelled JavaScript into the coveted bottom-left corner, which is awesome. SquirrelFish Extreme would probably be right up there with it if it were included.
More reason to believe JavaScript is going to be huge on the server-side. As concise and (nearly as) powerful as Python, Perl, and Ruby, but getting faster at a much more rapid clip...and already required for the front-end, so everybody on the team is already familiar with it. Library coverage will take another couple of years to reach a reasonable level, but I'd bet on it happening.
I agree. Once I learned how truly powerful Javascript is (once you get past the horrible DOM), I knew it could one day become a general-purpose scripting language and not just a web one. Once someone makes a good command-line Javascript interpreter, web apps will follow as soon as someone creates a CGI extension for it.
You need to look at how many data points each language implementation actually produces. gcc runs all the programs, whereas Java steady-state (the one that looks faster than gcc) lacks a lot of data points (i.e. programs). As you can see in the diagram, gcc's star is pulled towards the right by one data point (at least it looks like one). If Java steady-state ran all the programs, its star might shift to the right as well. Or it might not. It's just not comparable as it is; you need to look at the actual numbers instead of the diagrams.
My own benchmarks that I run on our heavily algorithmic long running code consistently show Java about 10% to 20% slower than gcc (C++) and memory usage about twice that of the C++ implementation (excluding the JVM memory itself). C and C++ performance depends heavily on avoiding malloc/new. C/C++ code that gratuitously allocates and frees heap memory tends to be much slower than Java code.
A two-line fix to compile in something like SmartHeap typically makes that memory-allocation penalty go away. Not that it's hard to avoid unnecessary allocation/deallocation.
MLton may not allow separate compilation (it does whole-program optimization, at least), and it's pretty slow even on small programs, but if you were really using just the Standard ML language, you could use a faster compiler for development. I'd probably want to use some extensions, though, and I'm not sure the intersection of MLton's extensions with some other implementation's would satisfy me.
MLton seems to be used for optimized compilation of release builds, while general development is typically done in another SML environment, such as smlnj.
I don't have really concrete experience with SML, just OCaml, which also has its share of warts. It's a fantastic language in certain niches, though; generally, the sort of niches for which dynamic / "scripting" languages are usually a poor fit. (I wouldn't write a serious compiler in Python, for example.) They're very complementary.
I think the ML family has a lot of advantages, but the community is pretty closely tied to academia (much like Haskell's, but without the buzz). While that in itself is probably neither good nor bad, in this case a lot of the terminology is especially math-y; ML has many features programmers would find tremendously useful, except that they're usually discussed in terms of Greek letters, which is insular. I got a copy of The Definition of Standard ML from Amazon, hoping to get further insight into the language's design, and found it completely impenetrable. (Then again, I'm not a grad student.)
Newer languages seem to be adopting type inference (finally!), but as far as I know, nobody has really run with the design of its module system.
I thought this was the most telling line of the article: "Ultimately the first factor of performance is the maturity of the implementation."
That supports a common conviction held by fans of functional programming: if all of the years of arduous optimization that have been poured into GCC had instead been poured into (say) GHC, then Haskell would be even faster today than C is.
That is, to many people functional programming languages seem to have more potential for performance than lower-level procedural languages, since they give the compiler so much more to work with, and in the long run a compiler can optimize much better than a programmer. But so much more work has been put into the C-style compilers that it's hard to make a fair comparison. It's still hard, but this experiment seems to give some solace to the FP camp.
Haskell may potentially be compiled to within epsilon of C when using strict evaluation + mutable state + unboxed types (look at the shootout implementations; the majority are in this style), but lazy evaluation and functional data structures have a real cost.
It's not merely that the ghc developers have been insufficiently clever and diligent.