Reading Ruby Code: ROM - Exploration
This is the third part of my ongoing series on code reading; the beginning can be found here.
In the first two posts on code reading we got the setup and presented an overview of the Ruby Object Mapper (ROM). With that out of the way, let's dig into the real code reading and begin exploring. To start, let's focus on the container method in the example code.
For a refresher, here it is within the example:
rom = ROM.container(:sql, 'sqlite::memory') do |conf|
  conf.default.create_table(:users) do
    primary_key :id
    column :name, String, null: false
    column :email, String, null: false
  end
end
Here the container method is called on the ROM module with two arguments and a block. An object is passed to the block, and we configure options on that object. This is a fairly common pattern, where a singleton class provides a method for configuring itself via block or normal method chains, for example:
Client.configuration.protocol = 'https'
Client.configuration.domain = 'test.com'
# or
Client.configure do |config|
  config.protocol = 'https'
  config.domain = 'test.com'
end
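For illustration, here is a minimal sketch of how such a Client could support both styles. The Client class and its Configuration struct are hypothetical names chosen to match the usage above; they are not taken from ROM or any particular library.

```ruby
# Hypothetical Client class; the names mirror the usage example above.
class Client
  # A simple value holder for the two options used in the example.
  Configuration = Struct.new(:protocol, :domain)

  class << self
    # Lazily build and memoize one configuration object on the class itself.
    def configuration
      @configuration ||= Configuration.new
    end

    # Yield that same object so callers can set options inside a block.
    def configure
      yield configuration
      self
    end
  end
end

Client.configure do |config|
  config.protocol = 'https'
  config.domain = 'test.com'
end

Client.configuration.protocol # => "https"
```

Both entry points touch the same memoized object, which is why the two styles are interchangeable.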
In both cases, a configuration object is being stored on the singleton class. In the block case, this object is yielded to the block; in the method-chain case it is accessed directly. The block pattern avoids the need to make a chained method call repetitively and also segregates the configuration into a block. This is easier both to write and to read. ROM is following this pattern to an extent, but we will see that the configuration happening within the container method is much more complex. To find the container method, a quick grep reveals the method within the create_container.rb file:
lib/rom/create_container.rb
This code sample is by rom-rb, you can view the full file here.
def self.container(*args, &block)
  InlineCreateContainer.new(*args, &block).container
end
Which is just a wrapper around the InlineCreateContainer class:
lib/rom/create_container.rb
This code sample is by rom-rb, you can view the full file here.
class InlineCreateContainer < CreateContainer
  def initialize(*args, &block)
    case args.first
    when Configuration
      environment = args.first.environment
      setup = args.first.setup
    when Environment
      environment = args.first
      setup = args[1]
    else
      configuration = Configuration.new(*args, &block)
      environment = configuration.environment
      setup = configuration.setup
    end

    super(environment, setup)
  end
end
The else branch above is the path that our example code, with its arguments and block, would take, since its first argument is a symbol rather than a Configuration or an Environment. What's interesting here is the flexibility of this method’s inputs. It can accept a Configuration, an Environment (and Setup), or it will build a Configuration object using its args and block. At the bottom of the method, we see that we need an environment and setup. Configuration objects have both an Environment and Setup, so we can build up to a configuration to get these required objects.
This is a good example of the robustness principle or Postel’s law:
Be conservative in what you do, be liberal in what you accept from others
Here a container can be made using a variety of different inputs, but will always (with valid inputs) return a container. The actual building of the Configuration object from args and a block is isolated within the configuration class:
lib/rom/configuration.rb
This code sample is by rom-rb, you can view the full file here.
def initialize(*args, &block)
  @environment = Environment.new(*args)
  @setup = Setup.new

  block.call(self) unless block.nil?
end
The args are used to build the Environment object that will eventually be a part of the Configuration object. That configuration object is yielded to the passed block, allowing it to be modified from within that block.
With the environment and setup in hand, we use those objects to build a Finalize instance and run it:
lib/rom/create_container.rb
This code sample is by rom-rb, you can view the full file here.
def initialize(environment, setup)
  @container = finalize(environment, setup)
end

private

def finalize(environment, setup)
  environment.configure do |config|
    environment.gateways.each_key do |key|
      gateway_config = config.gateways[key]
      gateway_config.infer_relations = true unless gateway_config.key?(:infer_relations)
    end
  end

  finalize = Finalize.new(
    gateways: environment.gateways,
    gateway_map: environment.gateways_map,
    relation_classes: setup.relation_classes,
    command_classes: setup.command_classes,
    mappers: setup.mapper_classes,
    plugins: setup.plugins,
    config: environment.config.dup.freeze
  )

  finalize.run!
end
It is the Finalize class that is responsible for building up the container object:
lib/rom/setup/finalize.rb
This code sample is by rom-rb, you can view the full file here.
module ROM
  # This giant builds an container using defined classes for core parts of ROM
  #
  # It is used by the setup object after it's done gathering class definitions
  #
  # @private
  class Finalize
    attr_reader :gateways, :repo_adapter, :datasets, :gateway_map,
                :relation_classes, :mapper_classes, :mapper_objects, :command_classes, :plugins, :config
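Just to keep track, when ROM.container is originally called, the process is as follows. ROM.container builds an InlineCreateContainer, which in turn builds a Configuration (and with it an Environment and a Setup) from the args and block. The CreateContainer superclass then hands the environment and setup to a Finalize object, whose run! method builds the container.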
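To make that chain easier to hold in your head, here is a stripped-down model of it. Everything in the MiniROM module is a hypothetical simplification written for this post: the class names echo the ROM classes shown above, but the bodies are placeholders and the "container" is just a hash.

```ruby
# Simplified, hypothetical model of the call chain; not ROM's actual classes.
module MiniROM
  Environment = Struct.new(:args)
  Setup = Class.new

  class Configuration
    attr_reader :environment, :setup

    # Builds the environment and setup, then yields itself for configuration.
    def initialize(*args)
      @environment = Environment.new(args)
      @setup = Setup.new
      yield self if block_given?
    end
  end

  class Finalize
    def initialize(environment, setup)
      @environment = environment
      @setup = setup
    end

    # Placeholder "container": a hash recording what it was built from.
    def run!
      { built_from: @environment.args }
    end
  end

  class CreateContainer
    attr_reader :container

    def initialize(environment, setup)
      @container = Finalize.new(environment, setup).run!
    end
  end

  class InlineCreateContainer < CreateContainer
    # Only the "build a Configuration from args and block" branch is modelled.
    def initialize(*args, &block)
      configuration = Configuration.new(*args, &block)
      super(configuration.environment, configuration.setup)
    end
  end

  def self.container(*args, &block)
    InlineCreateContainer.new(*args, &block).container
  end
end

container = MiniROM.container(:sql, 'sqlite::memory') { |conf| conf.setup }
container[:built_from] # => [:sql, "sqlite::memory"]
```

The shape is the point here: one public entry method, one class that normalizes inputs, and one class that does the building.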
The usage of three classes to perform one operation could be viewed in several ways. Some may argue that it is unnecessarily complex, that it is Object Oriented design run amok. Or that it is too hard to follow through multiple files. I disagree with these characterizations because:
- The container is a central piece of ROM, and so having flexible and varied ways to build one is useful
- Classes are well named (InlineCreateContainer explains what it does), therefore following becomes easier
- Functionality is well segregated between classes
However, it does require navigating between three files (create_container, configuration, and finalize) and about twice that many methods to piece together how a container is built. As a code reader, this might be more difficult than scrolling through a single file. From a practical perspective, as I am following a progression, I open each new file as a separate “buffer” (in some editors this might be called a window or tab). In that way I can flip quickly between the files and follow the progress.
One other trick for not losing focus on what we want to know is:
When following an operation through several classes or methods, focus on the beginning and end of methods
For example, the InlineCreateContainer initializer has 13 lines in it, but the key to the method is that it passes an environment and setup object up to its superclass. It accepts a lot, but setup and environment are moving on. This is revealed on the last line of the method. So, while we may want to know what environment and setup are, we probably should continue on to the CreateContainer superclass to follow the progression. If the local variables are well named, we should also be able to reason as to what they are and what class they might come from.
A hard to find method
After resolving the overall creation of the container, the block configuration of the container still needs to be examined. In the example, the configuration object is yielded to the block. A default method is called, revealing something that responds to the create_table method. However, default does not exist on the Configuration class, and it's not easily revealed through a search of the project (the word “default” is fairly common).
One issue with dynamic, interpreted languages such as Ruby is that methods can come from several sources. Aside from standard definitions, they can also come from mixins, meta-programming, etc. Moreover, classes can be re-opened and added to at later points. While this offers a great deal of flexibility to a developer, it can make finding a given method in Ruby code difficult. Luckily, there is the aptly named method method and its corresponding class. Let's use that method to find our mysterious default method. First you need an instance of the class that you want the method for, so at this point I would drop a debugger (pry is my debugger of choice; byebug is another option) in my example code and break here:
rom = ROM.container(:sql, 'sqlite::memory') do |conf|
  require 'pry'; binding.pry
  conf.default.create_table(:users) do
Running the example, I now have a Configuration object (conf) that I can interrogate to get its methods. (You could of course build the object in a small script or test, but this is much faster for the problem at hand.) From looking at the source I know the use method exists, so let's try that:
[3] pry(main)> conf.method(:use).source_location
=> ["/Users/michael/projects/ruby/rom-rb-exploration/vendor/ruby/2.3.0/gems/rom-2.0.0/lib/rom/configuration.rb", 34]
Calling source_location reveals the line and file where the method is defined. This is very handy when you have multiple gems/repositories in play. However, when we call method for the default method we get:
[4] pry(main)> conf.method(:default)
NameError: undefined method `default' for class `#<Class:#<ROM::Configuration:0x007f8896277f88>>'
What happened? Ruby is telling us that default is undefined on the Configuration class. The example code runs fine though, so something else must be in play. This is, of course, method_missing, a useful meta-programming feature for providing dynamic methods.
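To see this behavior in isolation, here is a small self-contained sketch. The Settings class is a hypothetical stand-in that I made up for illustration; it only imitates the idea of answering unknown method names from a hash.

```ruby
# Hypothetical stand-in for a class that answers unknown methods from a hash.
class Settings
  def initialize(entries)
    @entries = entries
  end

  # A regularly defined method: Object#method can find it.
  def use(name)
    @entries.fetch(name)
  end

  private

  # Unknown method names are looked up as keys; anything else falls
  # through to Ruby's normal NoMethodError via super.
  def method_missing(name, *args)
    return @entries[name] if @entries.key?(name)

    super
  end
end

settings = Settings.new(default: :sql_gateway)

settings.default                      # => :sql_gateway, via method_missing
settings.method(:use).source_location # => [file, line] of the definition

lookup_failed =
  begin
    settings.method(:default)
    false
  rescue NameError
    true # the call works, yet Object#method cannot find a definition
  end
```

One related detail from Ruby itself: if a class also defines respond_to_missing? for those names, Object#method will return a Method object for them instead of raising. The sketch above leaves it out on purpose, to reproduce the confusing case.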
When using method and a known defined method is “undefined”, look for a method_missing definition
Indeed, this is the case for the configuration class:
lib/rom/configuration.rb
This code sample is by rom-rb, you can view the full file here.
def method_missing(name, *)
  gateways.fetch(name) { super }
end
If a method is undefined, this class will try to find that method name as a key in the hash-like gateways object. Failing that, the normal Ruby method_missing behavior continues. So, when we call default, the value for the default key within gateways is returned. Since we still have a debugger open, let's look at the gateways object:
[5] pry(main)> conf.gateways
=> {:default=>
#<ROM::SQL::Gateway:0x007f8896275fa8
@connection=#<Sequel::SQLite::Database: "sqlite::memory">,
@migrator=#<ROM::SQL::Migration::Migrator:0x007f88949ea880 @connection=#<Sequel::SQLite::Database: "sqlite::memory">, @options={:path=>"db/migrate"}, @path="db/migrate">,
@options={:migrator=>#<ROM::SQL::Migration::Migrator:0x007f88949ea880 @connection=#<Sequel::SQLite::Database: "sqlite::memory">, @options={:path=>"db/migrate"}, @path="db/migrate">}>}
As expected, gateways is a hash with the values being Gateway objects. In the example, default is a SQL::Gateway because that is what we passed as an argument to the original container. ROM containers are not limited to a single gateway/adapter, but when there is just one it will become the default. We can see how this adapter switching/differentiation happens by looking at the Gateway class itself. (I’m skipping over tracing back into the container building to show this piece. If interested, refer back to the Configuration initializer shown earlier, which initializes an Environment object.)
lib/rom/gateway.rb
This code sample is by rom-rb, you can view the full file here.
 97 adapter = ROM.adapters.fetch(type) {
 98   begin
 99     require "rom/#{type}"
100   rescue LoadError
101     raise AdapterLoadError, "Failed to load adapter rom/#{type}"
102   end
103
104   ROM.adapters.fetch(type)
105 }
This is a clever usage of Hash’s fetch to provide some dynamism and on-demand loading. fetch first looks for the passed adapter key in the adapters hash. If the key is found, the value will be returned. If not found, the block in lines 98-104 will be executed. In the block, an attempt is made to require/load the missing adapter. Presumably, requiring the adapter will add it to this hash, because the fetch method is called again on line 104, and at this point it will either return the adapter or raise a KeyError. In short, this code will attempt to load an unloaded adapter, and failing that it raises an error (either because the adapter doesn’t exist or can’t be loaded, or because the adapter wasn’t added properly, which gives the KeyError). We are assured by the end of this method that we either have an adapter-like object or have raised an error. To confirm this, we can see that the actual set of adapters is stored as a hash on the ROM module as an attribute:
lib/rom/global.rb
This code sample is by rom-rb, you can view the full file here.
# An internal adapter identifier => adapter module map used by setup
#
# @return [Hash<Symbol=>Module>]
#
# @api private
attr_reader :adapters
To add an adapter, you “register” it with the register_adapter method:
lib/rom/global.rb
This code sample is by rom-rb, you can view the full file here.
def register_adapter(identifier, adapter)
  adapters[identifier] = adapter
  self
end
We can see this in action in the rom-sql gem:
lib/rom/sql.rb
This code sample is by rom-rb, you can view the full file here.
ROM.register_adapter(:sql, ROM::SQL)
So, requiring the file rom/sql, as we might in line 99 of the gateway code, will add ROM::SQL to the list of adapters.
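Putting the registration and the nested fetch together, the whole pattern can be sketched in a few lines. The Registry module below is hypothetical and is not ROM's code: in particular, the loaders hash of lambdas is my stand-in for files on the load path, so that the example runs without any gems installed.

```ruby
# Hypothetical registry modelled loosely on the pattern above; not ROM's code.
module Registry
  class AdapterLoadError < StandardError; end

  @adapters = {}
  # Stand-in for `require "rom/#{type}"`: each lambda registers an adapter
  # as a side effect, the way a required file would.
  @loaders = {
    sql: -> { Registry.register_adapter(:sql, :sql_adapter) }
  }

  class << self
    attr_reader :adapters

    def register_adapter(identifier, adapter)
      adapters[identifier] = adapter
      self
    end

    def adapter_for(type)
      adapters.fetch(type) do
        loader = @loaders.fetch(type) do
          raise AdapterLoadError, "Failed to load adapter #{type}"
        end
        loader.call
        # Raises KeyError if the loader ran but never registered itself.
        adapters.fetch(type)
      end
    end
  end
end

Registry.adapters.key?(:sql) # => false, nothing is loaded yet
Registry.adapter_for(:sql)   # => :sql_adapter, loaded on first use
Registry.adapters.key?(:sql) # => true
```

Asking for an adapter with no loader, such as Registry.adapter_for(:csv), raises AdapterLoadError in this sketch.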
This is our first encounter with the pluggable/modular nature of ROM. ROM deals with data persistence, and as such wants to support a wide range of databases and persistence formats. (ROM is not confined to databases such as mongo or PostgreSQL; for example, you can use CSVs as a persistence format in ROM.) In addition to performing a standard set of business logic, ROM also wants to leave the door open to expansion to new data storage formats.
One approach to solve this problem is “one gem to rule them all”, i.e. rom-rb/rom holds everything from PostgreSQL code to CSV code. No sub-gems or plugins, just ROM. This is problematic for a few reasons, the most tangible being that every time the maintainer needs to fix anything, the gem must be bumped and pushed out. So if you are a happy CSV user, you are pushed to upgrade every time a PostgreSQL fix is pushed out. It also becomes a burden on the maintainers because the lines between sections of code are not as clear.
Given these issues, a more robust design choice is creating a plugin-like architecture where the end user can choose the adapters they want to use/load. The user is not required to pull in code that is not needed. Thus a CSV user just loads the rom/csv gem and can happily ignore all updates to rom/sql, and vice versa. This leads to its own trade-offs and maintenance burdens, but it's likely the better choice for this situation.
The other noteworthy design aspect is the clear separation of concerns in the plugin architecture. The ROM module provides a structure/harness for using individual adapters. The individual adapter only has to register itself with the main module. What doesn’t happen is the listing of individual adapters in advance with the ROM module. Additionally, the individual adapters make only a single method call on the module. They aren’t directly accessing the adapters hash or anything of that nature. In fact, the adapter doesn’t even know that adapters exists, much less that it is a hash. Thus, the implementation of the adapters storage can change at will as long as the method signature of register_adapter stays the same.
Takeaways
In this blog post we explored how the container method works and, more generally, the Container class. I demonstrated how I follow code through multiple classes, and how I use the method call to find a dynamic method. For code reading in general, though, I hope I have demonstrated that there are things to be gained from it, no matter what your level of development:
For the absolute beginner… simply being able to follow and understand the above code would be considered a major success
For the intermediate rubyist… clever usage of nested fetch to achieve auto-loading behavior and writing methods with flexible inputs
For the more seasoned developer… interesting architectural patterns abound
For example, I personally have never written a serious plugin type architecture. While I have implemented a register-like interface, the architecture explored herein is much more developed. If I were to implement something like this down the road, I’d have a vague idea of what I’d want to do and could always refer back to this or other open source examples. This is the power of code reading: I’ve given myself a design shortcut for the future by simply being exposed to ideas in the present.
We will continue with the exploration of ROM in our next article. Thanks for reading!