Hmmm, I'm always really surprised when people comment on threads like this.
How much time did you take to do this? I can read code, but I simply can't casually take a look at code like this and say, hey that's cool, hey that's interesting. I have to concentrate and go line by line, in more of a debugging state of mind.
There are programmers who can sort of skim-read code, which just amazes me.
The thing is, reading code is hard. Even well-written code is hard to read and grok; the other kind can take days to absorb.
But it is very effective as a way of spotting certain kinds of bugs, while at the same time gaining a much deeper understanding of a codebase.
Reading code in a non-trivial way is a very special skill that not many developers have, but one that is very worthwhile to attain.
Whenever I come to a new codebase, I start by grabbing a pretty random bug and reading the code in that area: just flicking through, forcing myself to ask questions about it and finding out the answers.
Actively engaging with code like that is an amazingly effective way of absorbing a codebase.
By reading code, I also meant debugging. They don't really explain how things work and their response is usually 'just read the fucking code!', so I presumed that I am just terrible at reading code.
I certainly _do not_ want to denigrate the idea of writing code that's designed to be read and studied - I think that's a great idea, and try to do it. But I've never really been able to 'read' a codebase, and I don't really understand what people who do this are doing.
My technique for getting to know a codebase is to look at one thing in particular, probably with the help of a debugger, tracing the call stack and how the data flows and changes. I get to know a whole project in sections, focusing on individual bits of functionality.
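For anyone who wants to try that trace-the-call-stack approach without a full debugger, here is a minimal Python sketch. It assumes a toy codebase: `trace_calls`, `order_total`, `price_of`, and `load_prices` are hypothetical names made up for illustration. `sys.settrace` is the same standard-library hook that pdb uses.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and record every Python-level call it makes.

    Returns (result, calls), where calls is a list of
    (depth, function_name) tuples in the order the calls happened.
    """
    calls = []
    depth = 0

    def tracer(frame, event, arg):
        nonlocal depth
        if event == "call":
            calls.append((depth, frame.f_code.co_name))
            depth += 1
        elif event == "return":
            depth -= 1
        # Returning the tracer keeps it installed for the new frame,
        # so the matching "return" event is seen as well.
        return tracer

    previous = sys.gettrace()  # be polite to any debugger already attached
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)
    return result, calls


# --- hypothetical code under study ---------------------------------
def load_prices():
    return {"apple": 3, "pear": 2}


def price_of(item):
    return load_prices()[item]


def order_total(items):
    total = 0
    for item in items:
        total += price_of(item)
    return total


result, calls = trace_calls(order_total, ["apple", "pear"])
for depth, name in calls:
    print("  " * depth + name)
```

The indented printout is a rough picture of one section of the project for one task, which is about as much as I can keep in my head anyway.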
And if I'm honest, I rarely ever 'read' code this way either. I usually do it because I want to add a feature, improve performance, or squash a bug, and I stop when I understand enough to be fairly sure I'm doing things in a responsible manner: keeping with the architecture and not leaving the code any more complex than is required to accomplish my task.
I don't understand how people can 'read code' in any sort of a straightforward manner. The phrasing seems to suggest to me a linear 'reading' like a book or a technical manual. But you can't read code like that. If a project is of even moderate size, there are too many interwoven dependencies. Even a very well factored codebase is at best a top-down or bottom-up tree, and you (or, at least, I) often can't understand the trunk without understanding the disparate branches and leaves. I generally can't keep all that in my head without a task to focus on.
I think it's arguably even irresponsible to suggest to newcomers that they should 'read code' because it sets them up for failure when they try to do it and the complexity inevitably overwhelms them.
Do I misunderstand what people mean when they say "read code"?
The simple fact of the matter is that reading code is hard, maybe even impossible in the general case. You can understand code with some amount of effort, but it often boils down to an exercise in reverse engineering.
One thing this means is that in any substantial codebase you are never going to understand all of it. You will typically only have time to learn a fraction of the system, so if you are going to proactively explore the codebase, you will need to prioritize. You probably (but not necessarily) want to get a handle on the top-level architecture before digging deep anywhere.
My final piece of advice is that I personally find it impossible to understand just about any non-trivial piece of code without running it, and running it multiple times (1). Perhaps even many, many times. You can run under a debugger (single stepping or breakpoints) and this seems to work for many people. I still rely on print statements sprinkled through the code myself, adding and removing them as I run the code in question over and over again as my current point of interest moves from place to place in the code. This might sound scary, but it's not that different from the way you normally debug code.
(1) It's entirely possible that the person that wrote the code in the first place also ran it many, many times (testing each small change) as they wrote it. So it's perhaps not unreasonable that you yourself may need to run it many, many times in order to understand it later.
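To make the print-statement habit concrete, here is a minimal Python sketch of the kind of throwaway probe I mean. The `probe` helper and the `normalize_scores` function are hypothetical, invented only for illustration; the idea is just that a probe returns its value unchanged, so it can be wrapped around an expression and deleted again without touching the logic.

```python
import sys


def probe(label, value):
    """Print a labelled value to stderr and hand it back unchanged."""
    print(f"[probe] {label} = {value!r}", file=sys.stderr)
    return value


def normalize_scores(scores):
    # Hypothetical function under study. The probe() calls are temporary:
    # add them where the current point of interest is, rerun, then move on.
    highest = probe("highest", max(scores))
    scaled = [score / highest for score in scores]
    return probe("scaled", scaled)


normalize_scores([2, 4, 8])
```

Writing to stderr keeps the probes out of whatever the program normally prints, which makes them easier to spot and easier to strip out afterwards.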
And it is surprising how much you can learn from reading code. The most humbling thing I learned from reading other people's code is that I'm not nearly as good a programmer as I thought I was.
I haven't read a program yet that didn't teach me something.
Reading programs is hard work though. It can tire you out pretty quickly, especially in the stage before things start clicking into place and you can start to predict what's coming next based on the parts that you've already grokked.
I think that's a little different. Reading code almost never involves analyzing it to that degree. Usually you're just trying to figure out what part is broken or where you need to add some new functionality.
To me, reading new code is like separating a bunch of strings that have been tangled together. What thread goes where, what's its purpose, how is it connected to the other threads? It's a very active process.
Reading language is different: I move my eyes left to right, top to bottom, to consume someone else's ideas. It's a much more passive process than reading code.
It's not about understanding a line of code; anybody can do that. It's about absorbing kilo-lines of code and gaining an understanding of the whole thing, without having to draw attention to every line.
I can read code by the page, hitting next page at a rate of about 1 Hz, if the code is not overheated. That means: avoid lots of syntactic bloat, keep it concise, keep it modular, with low branching. Just about what the OP says.
As a computer programmer, I've realized that one of the best ways to improve myself is to read the code written by the masters of the art and try to emulate them. This is helped by the enormous amount of open-source code out there.
However, along the way, the very act of reading code has become a stumbling block in my journey.
I'm posting this because I want to get a perspective on how other programmers approach this problem. When faced with a huge chunk of code, how do you guys read it? Do you read it line by line? Do you put it into an IDE, look at the outline and simply jump into the functions you are interested in?
Do you read through the "main" function first and then branch out to the utility functions or do the reverse where you read the utility functions and subroutines first and then figure out how they are put together?
Please share with me some of the tips and tricks of code reading that you have discovered.
I try to read the code I need to use, but of course I never have the time to read as much as I should. It's really the best way to learn programming in general and to learn particular APIs, as important as a musician listening to other musicians' music.
I feel much more confident using libraries whose source code I've already at least skimmed through. (Speed reading code and learning where to look for stuff later when you need it is a useful skill to develop.)
But reading static code isn't enough to trust it and be sure the comments and formatting aren't lying to you. Stepping through code in the debugger and looking at its runtime state and control flow is crucial to understanding what's really going on.
But the problem with reading code (especially code that you're not running in the debugger) is that you see what you think it's supposed to do, not what it's actually doing, especially when you're "skimming" over it as I like to do.
Occasionally I have the luxury of enough time to go into "study mode" and carefully read over code line by line (I've been reading the amazing npm packages in http://voxeljs.com recently, which is some amazing and beautiful JavaScript code that I recommend highly). But that is extremely tedious and exhausting, and uses so much energy and attention and blood sugar that I have to close my eyes and take little power naps to let my mind garbage collect.
And then I get these weird dreams where I'm thinking in terms of the new models and APIs I've just learned, and sometimes wake up in a cold sweat screaming in the middle of the night. (I have sympathy for Theo and the neighbors he wakes up at night with nightmares about all the terrifying code he reads.) (So far no terrible nightmares about voxeljs, but a few claustrophobic underground Minecraft flashbacks.)
Refactoring or rewriting or translating code to another language is a great way to force yourself to really understand some code. I've found some terrible bugs in my own code that way, that I totally overlooked before. And looking back, the reason the bugs were there was that I just saw what I intended to have written, instead of what I actually wrote.
And for those kinds of bugs, comments that describe the programmer's intent are actually very dangerous and misleading if they're not totally up to date and valid. Because the compiler does not check comments for errors!
I try to use lots of intermediate descriptive variable names (instead of complex nested expressions), lots of asserts and debug logs, and do things in small easy to understand and validate steps that you can single step through with the debugger. It's important to examine the runtime state of the program as well as the static source code. But that is hellishly hard to do with networking code in the kernel.
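Here is a minimal Python sketch of what I mean by small, named, checkable steps. The `shipping_cost` function and all of its rates are made up purely for illustration; only the `logging` calls are real standard-library API.

```python
import logging

logger = logging.getLogger(__name__)


def shipping_cost(weight_kg, distance_km, is_express):
    """Hypothetical example: every intermediate value gets its own name."""
    # Asserts document (and check) what the code assumes about its inputs.
    assert weight_kg > 0, "weight must be positive"
    assert distance_km >= 0, "distance cannot be negative"

    # One small step per line, so each value can be inspected
    # while single stepping in the debugger.
    base_fee = 5.0
    weight_fee = 1.5 * weight_kg
    distance_fee = 0.25 * distance_km
    subtotal = base_fee + weight_fee + distance_fee
    logger.debug(
        "subtotal=%.2f (base=%.2f weight=%.2f distance=%.2f)",
        subtotal, base_fee, weight_fee, distance_fee,
    )

    express_multiplier = 2.0 if is_express else 1.0
    total = subtotal * express_multiplier
    logger.debug("total=%.2f (express_multiplier=%.1f)", total, express_multiplier)
    return total


logging.basicConfig(level=logging.DEBUG)
print(shipping_cost(2, 100, is_express=True))
```

The nested one-liner version of this would be shorter, but there would be nowhere to put a breakpoint and nothing to look at in the variables pane.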
I also like to get away from the distractions of the keyboard and debugger, and slog my way through every line of the code, by printing it out on paper, going outside, sitting in the sun under a tree, and reading through every page one by one front to back, scribbling notes on the paper with a magic marker. That forces me to make my way all the way through the code before making any false assumptions, jumping around, and getting distracted. (ADHD Management Techniques 101!)
And surprisingly all of the code is of equal importance, so you really need to review each line sequentially, instead of finding the stuff that you think is most likely to relate to what you're trying to figure out and debugging from there! Wow, I would like to see this marvel of engineering myself!
This is a long article and the first bits come off as fairly condescending, so I stopped reading…but one thing caught my eye:
“If you are a programmer, try to find an answer by reading the source code”
This is basically a superpower at any large tech co if only because few people do it, even though we’re literally all capable of it. When I inherit or start interfacing with a new service, the first thing I do is check out the code and peruse it. Even just 30-45 minutes, for an experienced engineer, is enough to get a feel for the layout of things. Then when you have a question about “How does X service handle Y scenario” you can just go read the code and know the answer exactly.
I can’t tell you how many times I’ve had the answer to a question and gotten something like “dang rco8786 how do you know so much about how everything works” and invariably my answer is just “I read the code”.
I am very code literate, but depending on the size of the codebase and how much smoke and mirrors are used, I might take longer to grok how things are interconnected by reading the code than it would take me if someone gave a high-level overview in a paragraph or two.
You've made some really good points about reading code in both of your posts. I'm fairly new to it, but I've started a daily habit of reading code and I've realized this is an important skill I've neglected in the past.
Since most of the code I read (C# code) can be run on my platform, I'll sometimes run into that rare piece of code where I can't believe it does what it says it does. In those cases I'll download the repo and debug it, and many times I'm surprised to find there was a gap in my understanding. Reading code helps me find knowledge gaps I didn't even know were there.