This is the second part of my ongoing series on code reading; the first post can be found here.
In the world of testing methodologies, there is the concept of “outside-in” testing. The higher level feature tests (how the user will interact with the application) are written first. These higher level tests are used to drive out and guide the development of more granular integration and unit tests, and ultimately the corresponding code. In a similar way, to begin code reading, I prefer to understand the codebase from the perspective of a user. I’d wager that for most developers, the first code you read is not the source code. Rather, it is the usage examples that show you what the library delivers to you as a consumer.
Continuing with our example library ROM.rb, let us orient ourselves to the library as a potential user. If nothing else, this allows us to break through the blank-page syndrome of code reading and find a place to start. Once you are oriented and understand the general structure of a codebase, it becomes much easier to look at individual pieces of code and understand how they fit into the larger structure. My general process for orienting myself is to:
- Understand the purpose of the project
- Build or find a working usage example
- Reason about the key parts of the example

I say “reason” here because I will be guessing about how I think the system will work. As I read more, I’ll be checking what I find against my assumptions. My guessing will also likely lead to questions about how things work.
With those three items complete, I feel comfortable enough to begin looking at specific pieces of the system and/or tackling a specific problem.
This feels like an obvious one, and hopefully it is. In order to understand how a project “works” you need to understand what it is trying to do. This understanding should come from the documentation, or more likely, the GitHub README. In the case of a new job, hopefully the project’s goal has been well explained to you before you look at a line of code. On the opposite side of the fence (for developers and project leads), realize how valuable a statement of purpose is for your project. This is the first thing a new developer, or a user, will need to know about your project. If this isn’t available above the fold of your project, something is wrong.
In the case of ROM:
While that gives us an idea of what ROM’s purpose is, to really understand it you have to dive into the documentation. Luckily, ROM includes several introductory pages, among which is a short paragraph on its goals. For those who do not want to read the introduction, ROM offers a way to manipulate, transform, and store data. It favors decoupling and separation of the persistence layer from business logic and offers an alternative to the widely used Active Record pattern (note: the AR pattern, not necessarily the library).
Having a working example enables us to ground our understanding in real-world usage. I cobbled this example together based on the documentation and the quick start guide:
This example sets up an in-memory SQLite database and adds one record to it. It then reads the record back and prints a field from it. One nice thing about this example is that what it is doing is fairly obvious. (In my opinion, at least; I may have some knowledge from reading the documentation while assembling it, but for the most part I think the example’s goal can be inferred directly from the code.) Despite its brevity, this example hits 3 major libraries in ROM (lines 1-3) and demonstrates the major functionality of ROM. As such it is a perfect example for our exercise at hand.
Being able to make this example was a direct by-product of the documentation. If you are entering a closed-source, proprietary project, this type of documentation may not exist. In that case, I would look first at automated tests; failing that, I would turn to co-workers or other maintainers. Failing that, you have to begin hunting for what appears to be a logical starting point based on the framework being used. Failing that, you have to read the code to find a logical starting place, but at that point you should be writing a few automated integration tests to get yourself some coverage.
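When there is no documentation and no test suite, a handful of checks that simply pin down what the code does today can stand in for a usage example. Here is a minimal sketch of that idea in plain Ruby. `LegacyBilling` and the `check` helper are hypothetical names invented for illustration; in a real project you would point your test framework of choice at the actual entry point you found.

```ruby
# Hypothetical legacy code that we do not fully understand yet.
module LegacyBilling
  def self.total(items)
    items.sum { |item| item[:price] * item.fetch(:qty, 1) }.round(2)
  end
end

# Compare an expected value with what the code returns today and report it.
def check(description, expected, actual)
  ok = expected == actual
  puts "#{ok ? 'PASS' : 'FAIL'}: #{description}"
  ok
end

# Each check records an observation made while reading or running the code.
results = [
  check("empty cart totals zero", 0, LegacyBilling.total([])),
  check("quantity defaults to one", 9.99, LegacyBilling.total([{ price: 9.99 }])),
  check("quantities multiply", 20.0, LegacyBilling.total([{ price: 5.0, qty: 4 }]))
]

puts "all observations hold" if results.all?
```

The value is less in the assertions themselves than in the fact that each one is a small, executable usage example.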
Understanding the key parts of the example
With a good example in hand, we can begin to find the key structural elements of the project. A worthwhile rule here:
The importance of an item likely correlates well with its place in the loose hierarchy of:
- Language Primitive (Ruby lacks a strong concept of primitives since everything is an object. However, you can still roughly define this based on the analogues that would be primitives in most other languages: Integers, Strings, Arrays, etc.)
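The note about primitives is easy to verify for yourself. The following is a small sketch using only core Ruby; the sample values are arbitrary choices of mine.

```ruby
# Values that would be primitives in many other languages are ordinary objects in Ruby.
samples = [42, "text", [1, 2], nil, true]

samples.each do |value|
  puts "#{value.inspect} is a #{value.class} (object? #{value.is_a?(Object)})"
end

# They respond to methods like any other object.
puts 42.respond_to?(:+)

# Modules and classes are themselves objects, instances of Module and Class.
puts Kernel.class
puts String.class
```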
Looking at the above example, we have a module ROM, a class that gets sub-classed as UserRepo. We create two key objects, one called user_repo. Before moving on to looking at major method calls, there is another useful rule:
Look for configuration code; configuration often hints at important parts of the system and gives clues to desired function.
For example, if you know that SQLite is a type of database, you can guess that on line 5 we are configuring an SQLite database. Further, you can guess that container sets up the connection to a database, and most likely it supports other types of databases. This hints at the fact that container is likely an important method to understand. The example goes on to define a block that will yield a conf object where more configuration happens.
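To make that guess concrete, here is a rough sketch of how a method that takes an adapter name and yields a configuration object is commonly written in Ruby. This is my guess at the general shape, not ROM’s implementation: `TinyConfig`, `TinyContainer`, and `create_table` are hypothetical names invented for this illustration.

```ruby
# Hypothetical configuration object that collects settings.
class TinyConfig
  attr_reader :adapter, :tables

  def initialize(adapter)
    @adapter = adapter
    @tables = {}
  end

  # Record a table definition; a real library would talk to the database here.
  def create_table(name, columns)
    @tables[name] = columns
  end
end

# Hypothetical root module exposing a `container` method.
module TinyContainer
  # Build a config for the adapter, yield it for further setup, then return it.
  def self.container(adapter)
    config = TinyConfig.new(adapter)
    yield config if block_given?
    config
  end
end

container = TinyContainer.container(:sqlite) do |conf|
  conf.create_table(:users, { name: :string, email: :string })
end

puts container.adapter
puts container.tables.keys.inspect
```

Seeing the pattern written out suggests what to look for in the real source: where the adapter argument is interpreted, and what kind of object the block receives.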
Even disregarding the configuration part of container, it still sticks out as a major method by the fact that it is being called on the root module, ROM. Further, its result is passed into the “main” object in the example, the UserRepo initializer (line 21). Other interesting methods are the call on (line 23) which does the work of creating the row in the DB. Related to that is the commands method in UserRepo which seems to define the
Lines 6-10 appear to create a table with two columns. This looks similar to a Rails database migration in its vernacular. This doesn’t pique my interest, which is actually a good thing: since I can guess at its behavior, I don’t have to prioritize it for investigation. Lines 21 onwards just exercise the things built above; they are only interesting insofar as they show the importance of the rom object and the
Now onto the stuff I straight don’t understand at first glance. Here’s a list of thoughts:
- Hash accessing during sub-classing in line 13 is curious and not very idiomatic in Ruby, definitely worth understanding.
- Where does that users method come from on line 17? Likely related to the first question.
- Where do we get that nice DB query syntax for line 17?
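Before digging into the source, it can help to sketch how the first two questions could be connected. One plausible mechanism, and this is only my guess rather than ROM’s actual implementation, is a class-level `[]` method that returns a freshly built parent class with an accessor already defined on it. `BaseRepo` and its internals are hypothetical stand-ins.

```ruby
# Hypothetical stand-in for a repository base class.
class BaseRepo
  # BaseRepo[:users] builds an anonymous subclass with a `users` reader on it.
  def self.[](relation_name)
    Class.new(self) do
      define_method(relation_name) do
        @relations.fetch(relation_name)
      end
    end
  end

  def initialize(relations)
    @relations = relations
  end
end

# The "hash access" in the class definition is just a method call that returns a class.
class UserRepo < BaseRepo[:users]
  def names
    users.map { |row| row[:name] }
  end
end

repo = UserRepo.new({ users: [{ name: "Jane" }] })
puts repo.names.inspect

# Ruby can tell us where a method was defined, which answers the second question.
owner = repo.method(:users).owner
puts owner == UserRepo.superclass
puts owner.superclass == BaseRepo
```

`Method#owner` (and `Method#source_location` for methods defined in files) is a handy tool while reading any Ruby codebase, whatever the real mechanism turns out to be.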
Given all of the above, I can reasonably come up with a list of things I want to look for:
- Method definitions
This simple process of identifying important items and then key method calls is a solid approach to code reading. It can easily become a recursive process where once you enter a new class file, you repeat the process and add new key classes and methods to your list. Keep working through your list until you’re finished. I won’t necessarily follow this recursive approach for this blog series, but it is a worthwhile approach for a codebase you will live with for a long period (e.g. a new job).
Finding what you want
Finding files in a project is directly related to how well a project sticks to a convention. The convention does not even have to be adopted outside of the project; it just needs to be consistent. Luckily, in the case of ROM, the gems stick to patterns that are widely adopted throughout Ruby. Failing a strong convention, you can use grep to find the class you would like. ROM also has a plugin-based architecture that we will explore more in a future article. This means that code is often separated across several repositories, but the setup I outlined in the first post allows us to traverse those repos easily.
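When conventions fail and you fall back to searching, the search itself is simple enough to script. The sketch below is a grep-like lookup in plain Ruby that finds where a class or module is defined under a directory. The `find_definitions` helper and the throwaway demo project it runs against are my own inventions for illustration; the regular expression is deliberately naive and will miss classes defined dynamically.

```ruby
require "tmpdir"
require "fileutils"

# Return [path, line_number] pairs where a class or module called `name` is defined.
def find_definitions(root, name)
  pattern = /^\s*(class|module)\s+(\w+::)*#{Regexp.escape(name)}\b/
  hits = []
  Dir.glob(File.join(root, "**", "*.rb")).sort.each do |path|
    File.foreach(path).with_index(1) do |line, number|
      hits << [path, number] if line.match?(pattern)
    end
  end
  hits
end

# Build a tiny throwaway project that follows the usual lib/<gem>/<class>.rb layout,
# search it, and return the hits with paths relative to the project root.
def demo_lookup
  Dir.mktmpdir do |root|
    path = File.join(root, "lib", "demo", "user_repo.rb")
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, "module Demo\n  class UserRepo\n  end\nend\n")
    find_definitions(root, "UserRepo").map { |file, line| [file.sub("#{root}/", ""), line] }
  end
end

p demo_lookup
```

In a project that follows the common Ruby layout, the file name usually mirrors the class name, so the search mostly confirms what the convention already told you.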
In the next article we will seek to answer the questions in this post through further code reading. Thanks for reading!