I'm failing to understand some things, maybe because I glanced through the paper with morning coffee. Apology if my tone ends up a little bit harsh, but these are just constructive criticism :)
The first one is lt2 and lt3 implementations. You are implementing cmp2 through lt2, but lt3 through cmp3 (is this omission?). Both of them are stack-sensitive. Without being too harsh, I'm getting the impression that the intention was to write the most horrible comparison possible, which is different than worst-case time complexity.
In the paper, lt2 (actually cmp2 in the paper) will always be at least two passes, and lt3 is at least one pass. I would not say they are two/single pass algorithms because complexity increases depending on list depth when lists are involved.
Maybe I'm wrong, but both Python and C++ comparison operators are designed to be general-purpose comparison functions (and I'm more sure about C++, because this was touted through hundreds of books). As such, they should be good enough for most average cases. If you want speed, you go with balanced trees or something funkier.
Also, for C++ Tree implementation, you are again using probably the worst approach - appending to vector recursively. Use list for this. Python list implements all sorts of tricks, comparing to C++ vector.
And the last thing, but not the least: C++ containers depend on implementation (gcc libstdc++, stlport, msvc whatever), and I've seen substantial speed differences in standard operations. Hell, my old (almost conforming) list implementation was much faster than libstdc++ implementation because it wasn't trying to be too clever with slices and other magic.
I'm sad you haven't used a more scientific approach with much more rigor here: what C++ compiler was used, what version, what assembly output was produced, on what processor, after how many runs, etc... Claiming "Python is faster than C++" sounds like a clickbait title.
I'm not trying to write the most horrible comparison at all. Perhaps the most important thing to keep in mind here is that this is a general computer science paper, and my comparison of C++ and Python is just intended to serve as a familiar (and vivid) illustrative example of the general phenomenon I'm trying to describe. The paper is emphatically not intended to be a "Python vs. C++" paper. Everything you see there that is "concrete" (the language, the running times, etc.) is intended to be a mere illustration of the overarching concept (design decisions & their consequences) being discussed, and it could manifest itself in any language.
The context to keep in mind when reading the paper is: When designing a programming language & its standard library (or any API), we need to define an interface we can use as a building block, and we're analyzing the consequences of our choice of building blocks. In particular, we first examine the case of comparison-based data structures, which requires defining ordering primitives. In C++, the primitive is the < operator. In Python 2, it's cmp(). (In Python 3, it's a mix of < and ==, whose implications I discuss as well.) We assume user-provided types implement that basic interface, and we implement everything else we need on top of that.
So the question I'm analyzing in that example is: What happens if my primitive comparison operation is a 2-way comparison (lt(), like in C++) and then I implement 3-way comparison in terms of that (such as when I need it for for a binary search tree)? Now, what if we do the opposite: what happens if instead my primitive comparison operation is a 3-way comparison (cmp(), like in Python 2) and I only need to perform a 2-way comparison later? What are the trade-offs?
To do this, I take both approaches, implementing each in terms of the other, and compare how they behave complexity-wise. The conclusion I come to is that the choice of the primitive (which is often made by the language designers) isn't merely a question of aesthetics, but rather, it actually affects the time complexity of common operations (like traversing search trees). Similarly, the decision to cache a hash code doesn't just result in a constant-factor improvement, but it can actually change the time complexity of a program. And so on.
I think if you re-read the paper with these in mind, it should hopefully make more sense. The rest of what you said doesn't enter the picture at all... these are already balanced binary trees, the decision to use less<> is fundamentally independent of what C++ stdlib implementation you use, and the time of the vector concatenation isn't even being measured. Those things are unrelated to the point of the paper entirely. I was just trying to minimize the extraneous lines of code so we can focus on the heart of the problem instead of getting distracted by boilerplate.
Thanks for the interesting case. I guess many readers have read too much into lt2 and lt3 but overlooked the code under 2.4.2: C++ could actually be that bad. In that code, you only showed the timing in comments. The result may be worth a table. Perhaps another way to structure the manuscript is to give the surprising result of 2.4.2 first and then to explain what makes that happen.
Yeah—I think I laid out the paper more like a story, but I might indeed need to change that as it appears it leaves people confused before they get to the punchline. Thanks for the feedback! Glad you found it interesting.
The first one is lt2 and lt3 implementations. You are implementing cmp2 through lt2, but lt3 through cmp3 (is this omission?). Both of them are stack-sensitive. Without being too harsh, I'm getting the impression that the intention was to write the most horrible comparison possible, which is different than worst-case time complexity.
In the paper, lt2 (actually cmp2 in the paper) will always be at least two passes, and lt3 is at least one pass. I would not say they are two/single pass algorithms because complexity increases depending on list depth when lists are involved.
Maybe I'm wrong, but both Python and C++ comparison operators are designed to be general-purpose comparison functions (and I'm more sure about C++, because this was touted through hundreds of books). As such, they should be good enough for most average cases. If you want speed, you go with balanced trees or something funkier.
Also, for C++ Tree implementation, you are again using probably the worst approach - appending to vector recursively. Use list for this. Python list implements all sorts of tricks, comparing to C++ vector.
And the last thing, but not the least: C++ containers depend on implementation (gcc libstdc++, stlport, msvc whatever), and I've seen substantial speed differences in standard operations. Hell, my old (almost conforming) list implementation was much faster than libstdc++ implementation because it wasn't trying to be too clever with slices and other magic.
I'm sad you haven't used a more scientific approach with much more rigor here: what C++ compiler was used, what version, what assembly output was produced, on what processor, after how many runs, etc... Claiming "Python is faster than C++" sounds like a clickbait title.
reply