Reading Ruby Code: ROM - DSLs
11 Jun 2017
After a long hiatus, this is the fourth part of my ongoing series on code reading; the beginning can be found here.
In terms of code reading, this introduces a new wrinkle. When a well-formed DSL is used,
the defined language begins to blend in with the language’s own
keywords. While the above is definitely ruby code, it lacks the usual
def structure we find in most other
.rb files. In fact,
my editor even has a plugin that provides a whole new syntax mode for rspec files
on top of the ruby mode. In exchange for this shift in structure, the writer and
the reader get a “language” that maps more tightly with the concepts it is
modeling. When the writer of the code uses DSLs correctly, they reveal
intent and convey meaning more quickly. Used incorrectly, or at least injudiciously,
they impose a burden on the reader to remember and hold onto concepts that may
be much clearer in the language’s standard vernacular.
ROM uses domain-specific languages in a few places; a notable example is defining a relation’s
schema. ROM provides methods such as attribute for specifying the
attributes of a schema (naturally):
This doesn’t look all that different from defining
attr_accessors on a plain old ruby object (PORO):
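As a minimal sketch of the PORO side of that comparison (the attribute names :id and :name are assumed here for illustration and are not taken from ROM):

```ruby
# A plain old ruby object; :id and :name are hypothetical attributes.
class User
  attr_accessor :id, :name
end

user = User.new
user.id = 1
user.name = "Jane"   # writer generated by attr_accessor
user.name            # reader generated by attr_accessor
```

Nothing here says what kind of value each attribute holds, which is one of the things the DSL version adds.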
In fact, I’d suspect that the two classes above would have similar APIs (at least in terms of the readers and writers for those attributes). What the DSL version layers on top of that is a visual and syntactical similarity to what it is trying to model. Compare the above to the PostgreSQL definition of a users table:
The syntax of the ROM class above maps well to the syntax that the DB uses to describe its structured data. Also remember that ROM is trying to work with almost any type of data. Therefore, the syntactical similarity speaks more to the fact that these two languages are hitting on the same commonality, rather than ROM being influenced by PostgreSQL syntax. We can draw this point out further if we look at the definition of a struct in Ecto (a database wrapper in the Elixir language). The example is drawn from the Ecto docs, and it is just a happy coincidence that they use a User as their exemplar.
While all slightly different, the above examples are speaking some type of lingua franca for modeling structured data. What this means for the reader is that if they are familiar with one of the above models, they don’t need to update their mental model too much to understand what is going on in another. An interesting counterpoint to this “common” language idea might be that it encourages groupthink, perhaps pushing people towards homogeneous thinking and stifling innovation.
The DSL version also conveys two additional pieces of information over the PORO
example. First, the type of each attribute. We can expect that name
should return a string and name= likely accepts a string. Secondly, schema
is some type of special feature of a User. Since we are using a special syntax
(and class method), the writer has elevated its importance for this class. As a
reader we should home in on this as an area to focus on. Users are not merely
some PORO data object; they also have this schema property that conveys
additional functionality. If we don’t understand what a schema is for a class,
it should be prioritized for further investigation.
Let’s talk about a naive implementation of the schema method.
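A naive version might look something like the sketch below. The Relation class, the schema and attribute methods, and the use of a plain hash are all simplified stand-ins assumed for illustration; this is not ROM’s code.

```ruby
# Naive sketch: the DSL methods live directly on the relation class.
class Relation
  # Runs the block (if one is given) and returns the stored attribute hash.
  def self.schema
    @schema ||= {}
    yield if block_given?
    @schema
  end

  # A public class method, so nothing restricts it to the schema block.
  def self.attribute(name, type)
    (@schema ||= {})[name] = type
  end
end

class Users < Relation
  schema do
    attribute :id, Integer
    attribute :name, String
  end
end

# The leak: attribute can be called from anywhere, long after the block ran.
Users.attribute(:admin, Object)
Users.schema.keys # now includes :admin as well as :id and :name
```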
This would meet the basic requirement of storing the schema. However, it
suffers from the fact that attribute is accessible outside of the schema
block, and even outside of the Relation class. This is problematic because the
relation may not need or want to expose this concept. Let’s look at how ROM
implements this instead. The Relation class gets its class methods from the
ClassInterface module, and from that module we get the schema method.
Why place the class methods of Relation in a separate module? I suspect the
reasoning is to partition the Relation class’s direct API (instance
methods) from the internal workings of the class. That way it is clear how
other objects interact with a Relation instance, and also clear where to go to
modify aspects of the class.
The schema method returns the @schema instance variable if it is defined.
Otherwise, it uses an instance of Schema::DSL to create a new schema
instance. This is memoization in a more verbose form than the usual ruby ||= idiom.
It has the added advantage that a memoized nil value still keeps the
computation from running again, because the check is whether the variable is
defined rather than whether it is truthy.
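To make the difference between the two memoization styles concrete, here is a small sketch. The Lookup class and its method names are hypothetical; the only point is to count how often the “expensive” step runs when its result is nil.

```ruby
# Counts how often each memoized method actually runs its computation.
class Lookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Usual idiom: recomputes whenever the cached value is nil or false.
  def with_or_equals
    @a ||= compute
  end

  # Verbose form: computes once, even when the result is nil.
  def with_defined
    return @b if defined?(@b)
    @b = compute
  end

  private

  def compute
    @calls += 1
    nil # pretend the computation legitimately returns nil
  end
end

a = Lookup.new
2.times { a.with_or_equals }
a.calls # => 2

b = Lookup.new
2.times { b.with_defined }
b.calls # => 1
```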
When a block is passed into the method, the schema method initializes a DSL
instance with the block and the other items needed to build a schema. The
instance_exec method then executes the block from schema within the context of
the newly created DSL instance. This gives the block access to that instance’s methods.
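The effect of instance_exec on self can be seen in isolation with a small sketch. Builder and Caller are hypothetical names used only for this illustration.

```ruby
class Builder
  # Calls the block as-is: self stays whatever it was where the block was written.
  def run_plain(&block)
    block.call
  end

  # Evaluates the block with self switched to this Builder instance.
  def run_in_context(&block)
    instance_exec(&block)
  end
end

class Caller
  def plain
    Builder.new.run_plain { self.class }
  end

  def in_context
    Builder.new.run_in_context { self.class }
  end
end

Caller.new.plain      # => Caller
Caller.new.in_context # => Builder
```

This switch of self is what lets a bare call such as attribute inside the block resolve to a method on the DSL object rather than on the class where the block was written.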
Once the block has run, the calls to attribute have populated the
@attributes hash, so the data needed to populate a Schema instance is
present within this DSL instance.
This prepared data can be used to populate an instance with the
call method that we saw earlier in the Relation’s schema method.
To review, the class methods of Relation are partitioned into a separate module,
which defines a schema method that instantiates a Schema::DSL object with the
block passed into it. This block is executed within the context of that
instance, allowing the class method caller to define the attributes that
are eventually used to build a Schema object. Finally, the call method on the
schema DSL creates the schema from these attributes.
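The whole flow can be sketched end to end in a few lines. The class and module names below mirror the ones discussed in this post, but the bodies are simplified assumptions written for illustration; they are not ROM’s actual source, and the real constructor arguments differ.

```ruby
class Schema
  attr_reader :name, :attributes

  def initialize(name, attributes)
    @name = name
    @attributes = attributes
  end

  # Collects attribute calls from a block, then builds a Schema from them.
  class DSL
    def initialize(name, &block)
      @name = name
      @attributes = {}
      @block = block
    end

    def attribute(name, type)
      @attributes[name] = type
    end

    # Runs the stored block against this DSL instance and builds the Schema.
    def call
      instance_exec(&@block) if @block
      Schema.new(@name, @attributes)
    end
  end
end

# Class-level API, kept apart from the relation's instance methods.
module ClassInterface
  def schema(&block)
    if defined?(@schema)
      @schema
    elsif block
      @schema = Schema::DSL.new(name, &block).call
    end
  end
end

class Relation
  extend ClassInterface
end

class Users < Relation
  schema do
    attribute :id, Integer
    attribute :name, String
  end
end

Users.schema.attributes.keys  # => [:id, :name]
Users.respond_to?(:attribute) # => false
```

Unlike the naive version, attribute exists only on the DSL object, so it is not reachable from the Relation class or from outside the schema block.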
What are the advantages of this approach to designing a DSL?
- The DSL for schema is self-contained. If a future developer wanted to add a new schema method, they would know exactly where to go.
- Executing the schema block within a DSL instance is a shrewd move because it keeps the code simpler: you don’t have to deal with what is or isn’t a class method, or pass many variables around.
While the DSL pattern outlined above is repeated throughout the ROM codebase, in
this particular instance it’s worth noting that this approach serves another purpose.
The Schema::DSL object is being used as a builder for preparing a
Schema object. Rather than asking the user of the Relation class to explicitly
build the object themselves, or learn a complicated set of configuration options
for the Schema class, the DSL provides an intuitive interface while hiding the
details of how to actually build a Schema object. This DSL-over-configuration
approach is an interesting pattern, and worth remembering in your own code writing.
DSLs are an important tool for both readers and writers of code. We reviewed ROM’s self-contained method of building a DSL. This showed how a DSL can be used to great effect for modeling the underlying domain and for providing an alternative to configuration. As readers of code, we should pay attention to where DSLs are used because they elevate the importance of that set of code. The above ROM sample is describing the data it is working with (a key part of the library itself), and the RSpec example is creating a test (the goal of a test suite). Additionally, when it comes to DSLs we should focus on how the new “language” articulates the underlying concepts. If the goal of a DSL is to map more closely to the domain, what intent and concepts is the DSL trying to convey? I believe this question applies equally to both the reader and the writer.
Thanks for reading, I look forward to more ruby code reading with you in the future.