
I have turned to pypy quite extensively for pure-Python text processing tasks, where I often get a 10-100x speed up just by changing the command line invocation. For example, I wrote a proof of concept Rabin-Karp hashing approximate string matching algorithm to perform plagiarism analysis while I was working at Udacity in around 2017. The system never went into production, but pypy was super helpful in crunching all historical user submissions for analysis.
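For readers who haven't met Rabin-Karp, here is a minimal sketch of the rolling-hash idea behind it. This is not the plagiarism system described above (that one did approximate matching and its code isn't shown); it is only plain exact substring search, and the function name, `base`, and `mod` values are illustrative assumptions. It is the kind of tight pure-Python loop that a JIT tends to help with.

```python
def rabin_karp_find(text, pattern, base=256, mod=1_000_000_007):
    """Return every start index where pattern occurs in text (exact match)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in an m-character window.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = 0
    w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    hits = []
    for i in range(n - m + 1):
        # Compare the actual strings on a hash match to rule out collisions.
        if w_hash == p_hash and text[i:i + m] == pattern:
            hits.append(i)
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return hits
```

Calling `rabin_karp_find("abracadabra", "abra")` returns `[0, 7]`.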

I’ve also had great success using pypy to accelerate preprocessing steps (when they don’t rely on incompatible c libraries) for machine learning pipelines. It’s the most painless performance enhancement trick in my toolbox before I reach for concurrency (in which case I reach for joblib or Dask).

The one oddity I’ve noticed is that using tuples (including things like named tuples) often speeds up CPython by a lot, but even plain tuples can slow down pypy on the same code—in some cases pypy winds up slower than CPython.

In any case, I’m low key in love with pypy, even though I can’t use it for _everything_.




FWIW PyPy is significantly faster than CPython in a lot of tasks.

For algorithmic code PyPy can provide substantial speedups over CPython. I've used PyPy in code fingerprinting large bioinformatics files and seen big speedups. I've also tried porting a webapp processing JSON from CPython and seen no perceptible speedup.

I recently discovered that pypy3 can run all my day to day Python code. It has some issues with slightly different behavior from cpython when using threads but other than that I see a 4x speedup on most of my slowest pure python workloads (parsing large rdf files and reserializing them after computing a total order on all their nodes). Huge win for productivity.

I use pypy as a drop-in replacement for CPython for some small data crunching scripts of my hobby projects. Might not count as "real work", but getting "free" speed ups is very nice and I'm very grateful for the PyPy project for providing a performant alternative to CPython.

I was close to trying pypy on a production django deployment (which gets ~100k views a month), but given that the tiny AWS EC2 instance we're running it on is memory bound, the increased pypy memory usage made it impractical to do so.


I use it at work for a script that parses and analyzes some log files in an unusual format. Wrote a naive parser with a parsing combinator library. It was too slow to be usable with CPython. Tried PyPy and got a 50x speed increase (yes, 50 times faster). Very happy with the results, actually =)

Caveat: pypy is not _consistently_ faster than CPython. There are cases where it is slower, sometimes even by a large margin.

Have you used pypy recently? I've found that memory usage in particular is much better as of around 1.9, compared to previous releases. Still worse than CPython, for sure, but some of my code is around 10x faster under pypy (all depends on what I'm doing, though, for sure; this stuff is numerically heavy).

Yes, PyPy is fantastic for long-running processes that aren't primarily wrappers around C code. In my experience, the speedups you see in its benchmarks translate to the real world very well.

PyPy is great -- while I still use CPython for our more complex webapp and associated tools that have heavy dependencies on C-extensions, I increasingly use PyPy for the more mundane cpu/data heavy lifting I do. It's typical to get 2X the performance (comparable to some compiled languages) and still use much of our utility code, configs, etc.

I think using PyPy instead of CPython will give you a performance boost several times larger than any of this.

I didn't hear about PyPy before, but I think you're doing great work.

I would be interested in seeing benchmarks where PyPy is compared with more recent versions of CPython. https://www.pypy.org/ currently shows a comparison with CPython 3.7, but recent releases of CPython (3.11+) put a lot of effort into performance which is important to take into account.


I run it in a production environment (side project). I also use it locally when developing when necessary.

It really does speed up loops by 5x or so.

So when you're trying to, say, test 100 million+ iterations of something, pypy will run that in something like 2 minutes, whereas cpython can take me 15 minutes.
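If you want to check this kind of loop speedup on your own machine, a rough sketch like the following can be run under both interpreters. The workload (`sum_of_squares`), the helper name `time_loop`, and the iteration count are all made up for illustration; real numbers will depend heavily on what the loop body does.

```python
import sys
import time


def sum_of_squares(n):
    """A deliberately plain pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_loop(n):
    """Run the loop once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = sum_of_squares(n)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    n = 2_000_000  # arbitrary size; raise it for a longer run
    result, elapsed = time_loop(n)
    # sys.implementation.name reports which interpreter ran the script.
    print(f"{sys.implementation.name}: {n} iterations in {elapsed:.2f}s")
```

Saving this as, say, `loop_bench.py` (a hypothetical filename) and running it once with `python3` and once with `pypy3` gives a quick side-by-side comparison. A single run that short may understate PyPy, since the JIT needs some warm-up time.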

Honestly it's an amazing performance gain for 0 effort, and I have yet to run into a limitation with it.


I found that PyPy sometimes has unexpected slowdowns. When we were porting some offline processing tools from CPython to PyPy, the craziest ones were building strings via += and flattening lists with sum(arrays, []), both of which were much slower than on CPython.
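For anyone hitting the same thing, the usual workaround is to avoid repeated concatenation altogether. CPython can sometimes extend a string in place on `+=` thanks to reference counting, which PyPy doesn't use, and `sum(lists, [])` copies the accumulated list on every step regardless of interpreter. A minimal sketch of the alternatives (the helper names here are just for illustration):

```python
from itertools import chain


def join_pieces(pieces):
    """Build one string from many pieces in a single pass, instead of s += piece."""
    return "".join(pieces)


def flatten(lists):
    """Flatten a list of lists in linear time, instead of sum(lists, [])."""
    return list(chain.from_iterable(lists))
```

Both forms behave well on CPython and PyPy, so switching to them shouldn't cost anything on either side.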

PyPy is actually very fast, ~10x faster than standard python.

PyPy is much faster than CPython.

Ok, but what CLI tools are you using where you would care about performance between CPython and PyPy?

Interestingly, CPython is twice as fast as PyPy for me.

Almost all Python code is written for CPython, and if you run into performance issues you tweak it until it runs on pypy.

I'm glad that pypy exists because it forces cpython to do something about performance, but for python overall it's a pretty niche tool.


pypy has quicker start up than CPython and also has a JIT. They're working on interesting memory optimisations too.
