
While this is something I very much appreciate about Python I think it's also true that structure needs to increase with the size of your codebase and the size of your team to avoid being driven mad. Python takes the "just apply discipline" approach which can absolutely work for lots of code but falls apart a bit with large heterogeneous teams.



I do love Python, but I often wish there was more rigor for some things. I currently work on a large project (>500K LOC) and have started to see it fall apart with a bigger team. Lack of enforced typing in function arguments, inability to create strict interfaces, inconsistency in standard library conventions, and package management would probably be my biggest gripes.
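To make the "enforced typing" and "strict interfaces" gripes concrete, here is a minimal sketch of the closest thing the standard library offers, assuming Python 3.9+ and a static checker such as mypy run separately. The names `Storage`, `MemoryStorage`, and `archive` are hypothetical, made up for illustration. Note that nothing here is enforced by the interpreter on its own, which is rather the point of the complaint.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Storage(Protocol):
    """Hypothetical interface: anything that can save and load text by key."""

    def save(self, key: str, value: str) -> None: ...

    def load(self, key: str) -> str: ...


class MemoryStorage:
    """Satisfies Storage structurally; no inheritance is needed."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._items[key] = value

    def load(self, key: str) -> str:
        return self._items[key]


def archive(store: Storage, key: str, value: str) -> str:
    # A static checker flags callers that pass something that is not a Storage.
    # The isinstance check below only verifies that the methods exist at
    # runtime; it does not check their signatures.
    if not isinstance(store, Storage):
        raise TypeError("store must provide save() and load()")
    store.save(key, value)
    return store.load(key)
```

The type hints are documentation plus input for an external tool; the explicit `isinstance` check is the only part that does anything at runtime, and it has to be written by hand.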

Like many people here, I've worked on large python deploys for most of my professional career and have, over time, come to see python's downsides more clearly.

I also wonder, thinking back over my experiences and reading others here, if we are mis-ordering how these things come to be. Maybe it's not that python codebases are, for a given line count, worse than other languages. Maybe it's that python allows teams to continue using practices that aren't suited to their current scale longer than other languages. The reason that we all have seen hulking, monstrous, nearly unworkable python code bases is that most other languages would have already collapsed under the mismatch between approach and desired outcome.

I think we often approach software engineering as a puzzle - where you have a set of inputs (a too-large codebase, for instance) and a question of how to get to a better state. But programming projects are path-dependent creatures. Huge codebases must develop over time - they don't spring out of the heads of developers fully formed. If you were using a typed language and your engineers had a lower LOC output per day, then of course your code base would be smaller. Is that better? You iterate more slowly, but the scale of your code is also more manageable. In my experience, the challenge of large python code bases comes from the context that you need to understand from the surrounding environment: what are all these objects, what are their objects, etc. Typed languages force you to carry around more context, so any given function is easier to read, but you can still write unnavigable code.


Are you the only person working on it? How big are we talking here, because I’ve worked on a few Python projects that fall in the region of “a couple to a few thousand lines” both by myself and with a team. Working by myself was significantly easier but Python definitely hits a “tangled mess” point, regardless of how disciplined I was.

In comparison one of my personal projects is in Rust and while that’s currently maybe only a thousand lines it’s already significantly easier to deal with.


As an experienced Python dev... Python's "convenience" and "speed of iteration" completely fall apart when you have more than half a dozen people of varying development experience working on the same codebase. You spend so much time digging yourself out of abuses of internal APIs.

I have personal projects that are bigger than 10KLOC and professional projects that are 10× that.

I'm not sure what you're getting at but code management in Python is really no harder than anything else. It can be clean and well delineated, or you can completely screw it up... But that's not Python's fault.


In a relatively terse language like Python, anything beyond a few screenfuls of code is already "large scale" development. It's unwise to keep it all in a single module.

Python doesn't really scale to large code bases, and refactoring is very hard.

No programming language design survives first contact with the programmers. This also holds at the project level: if you have 100 people working on the same project, it's very easy to mess it up.

I've seen lots of bad python code, I don't think it's hard to do the wrong thing.


My biggest problem with Python is that projects larger than a given size tend to become unmaintainable rather quickly.

This is in large part because of the lack of strong typing and type annotations; if you aren't the only author or can't keep the codebase in your head, it takes real effort to figure out what a function does. Even the type annotations provided in a language like Java or C++ make this task much easier, not to mention languages with real strong type systems like Haskell.

That's not to say that building large systems in Python is impossible, but it takes a lot more effort and documentation than it would in other languages.


As with any language it's how you end up architecting your codebase. I do believe python lacks an authoritative resource for what is "good architecture" which leads to a lot of the code scaling problems.

Very much agree. I oversee a relatively small python codebase, but getting good quality, safe code out of the developers in a controlled way is really hard - there are so many ways in which the language just doesn't have enough power to serve the needs of more complex apps. We have massive amounts of linting, type hinting, code reviews spotting obvious errors that would be just invalid code in other languages.

It's like getting on a roller coaster without a seat belt or a guard rail. It's fun at first, and you will make it around the first few bends OK ... then get ready ...

Of course, with enormous discipline, skill and effort you can overcome all this. But it just leaves the question - really, is this the best tool for the job in the end? Especially when you are paying for it with horrifically bad performance and other limitations.


Python can go wrong for large codebases. There are exceptions, of course, but if I focus on the consensus: large code bases in a dynamic language have an anecdotally significant likelihood of becoming unmaintainable. It's a glass ceiling that almost everyone I've met, who has done serious work with a dynamic language, has encountered.

There are ways to deal with it, but it requires rigorous discipline, and resisting Python's dynamic siren call earlier on in the process.

Golang naturally guides you into a style of programming which scales. You're not fighting the language (or your own inclinations) to avoid getting entangled later on.

Everything has exceptions†, you can shoot yourself in the foot with anything, yes. But, reasonably, it's about the relative struggle to end up in a similar place. Which is higher for Golang initially, but higher for Python later on.

Not trying to turn this into a Python vs Go thread; they both have their place. But Python can definitely go wrong, in an area significant to many people.

† Except Go... T_T


I think the author is overselling this library. A lot of the problems they mention can be avoided with a consistent application of discipline.

You can decompose classes that become too big for their own good. You can design your software, layer abstractions intelligently etc. so that having to do such refactoring isn't a big issue.

Python is a language that demands an above average level of discipline compared to many other programming languages I have used, but only because it IMO leans strongly towards empowering the developer instead of restricting them.


I agree 100% with the statement that it tells us more about our engineering team practices, but the question I have is, how do we make it better? Python provides rich built-ins, which make it easy to compose a rich suite.

It appears that no one else has an issue with it in my group, but I’m the oddball, these guys come from a TS/JS background. They don’t appear to express concern with it.

Are there resources that you can recommend that I can use to get better?


This is an honest question: what's the size of the projects you have written in Python? For me, once a project gets beyond some size, the aggravation of debugging old code and adding new code only grows. Mostly, I see this as a consequence of the lack of compilation and types in Python. How do others overcome this and enjoy large Python projects?

Can you expand what you mean re: large projects in Python?

Biggest problem I had with a largish Python codebase was code navigation and knowing the structure of the data going into and out of methods/functions. I had to attach debuggers just to see what was going on more often than I think should be necessary in a professional setting.
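One common way to reduce the "attach a debugger to see what this dict contains" problem is to convert loosely typed data into a named record once, at the boundary. A minimal sketch follows, with the assumption that the `Order` record, its fields, and both functions are hypothetical and purely illustrative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Hypothetical record; the fields are illustrative only."""

    order_id: int
    customer: str
    total_cents: int


def parse_order(raw: dict) -> Order:
    # Convert the loosely typed dict once, at the edge of the system.
    # A missing key fails here with a KeyError instead of deep in the call stack.
    return Order(
        order_id=int(raw["order_id"]),
        customer=str(raw["customer"]),
        total_cents=int(raw["total_cents"]),
    )


def apply_discount(order: Order, percent: int) -> Order:
    # The signature alone tells a reader what goes in and what comes out.
    discounted = order.total_cents * (100 - percent) // 100
    return Order(order.order_id, order.customer, discounted)
```

This does not make Python check anything at runtime beyond what `parse_order` does explicitly, but editors and static checkers can then follow `Order` through the codebase, which helps with navigation.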

Python doesn’t scale well beyond one file, one developer, one time write. Once a program needs maintaining, updating, expanding, debugging - the experience quickly deteriorates.

OK. Bear with me here. This might be overstating my case...

In some sense - there are no large projects. Only projects that have failed at being modular.

Maintainability involves keeping clear boundaries between functional units so that you can keep enough code in your head at any one time to reason about it.

If we assume that Python is capable of handling more code than you can keep in your head at any given time, then surely the problem is simply one of making sure your modules are nicely self-contained from each other.

