Introducing Factory Bakery

By Mohammed A.
Apr 04, 2021 • 7 mins read
Fixtures are the default way in Rails to fill the test database with sample data. Other solutions, such as factory_bot, use a Ruby API to build the fake objects. However, I needed to go a step further: a tool that uses the column definitions to generate the fake data automatically, just like model-bakery does for Django. I couldn’t find anything like that for Rails, so I made one! Meet factory_bakery!

How Did We Get Here?

We are building a new Rails app on top of a database that was originally created by a Django app. We didn’t have the luxury of writing factories incrementally as we designed each table. By the time we started, writing fixtures for all the models by hand would have been too much work.

I’ve used model-bakery in Django to generate fake data, and I loved it! I never found anything similar in the Rails world, most likely because Django models are tightly coupled to the database: the schema definition lives in the model itself. Rails takes a different approach: the models are empty by default, and the schema is defined in the migrations.

What Does the Solution Look Like?

The public API exposes two global functions (maybe not a good idea?): bake and bake!. The first only fills in the object; the second also persists it to the database.

 bake(User, email: '[email protected]', password: nil)
 # OR
 bake!(User, email: '[email protected]', password: nil) # Persist to DB

You only need to provide an attribute value when a test asserts against that value or when the model has a relationship with other models.

Some use cases from our codebase (we are using RSpec):

let!(:group) { bake!(Group) }
let!(:cost_code) { bake!(CostCode, group: group) }

The group on the cost code has to be assigned explicitly. But it’s still fun to use!

There are also plans to make the generators extensible. Because the gem is a Railtie, this can be done by appending to the array in the config:

# in development.rb, application.rb or test.rb
config.factory_bakery.generators << ... # some generator

Curious about how this is implemented? Let’s dive deep into the code!

Diving Deep Into ActiveRecord

ActiveRecord connects to the database, so there should be some metadata about the columns and the attributes somewhere. The keywords I was looking for were attributes, columns, or fields.

Then I found it! It was attribute_types, which returns a hash whose keys are attribute names (as strings) and whose values are ActiveModel type objects. Each type object carries metadata about the attribute and the kind of Ruby object it can hold.

Make use of attribute_types!

Now I know what I should use. The next step is to map those attribute types to values that can be assigned to fields.

I used a simple map to turn the attribute types into plain old Ruby objects (POROs) that hold the metadata, because I didn’t want the rest of the code to depend on the ActiveRecord or ActiveModel types. I called it AttributeDescriptor, and it looks like this:

AttributeDescriptor = Struct.new(:name, :type, :limit, :range, :choices, :unique)

It handles enums as well: the allowed values are stored in choices, so it will never generate an invalid enum value.

However, I faced a few challenges.
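As a rough sketch of the mapping described above, the snippet below turns a hash shaped like the result of attribute_types into descriptors. FakeType and describe_attributes are hypothetical names made up for this illustration; FakeType only stands in for the real type objects, so the example runs without Rails.

```ruby
# Stand-in for an ActiveModel type object (hypothetical, illustration only).
FakeType = Struct.new(:type, :limit)

AttributeDescriptor = Struct.new(:name, :type, :limit, :range, :choices, :unique)

# Hypothetical helper: build one descriptor per attribute.
# The input is shaped like Model.attribute_types: name (String) => type object.
def describe_attributes(attribute_types)
  attribute_types.map do |name, type|
    # range, choices and unique are left empty in this simplified sketch.
    AttributeDescriptor.new(name, type.type, type.limit, nil, nil, false)
  end
end

sample_types = {
  'id'    => FakeType.new(:integer, 8),
  'email' => FakeType.new(:string, 255),
  'bio'   => FakeType.new(:text, nil)
}

describe_attributes(sample_types).each do |d|
  puts "#{d.name}: #{d.type} (limit: #{d.limit.inspect})"
end
```

The generators then only need to know about AttributeDescriptor, not about the framework classes behind it.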

Challenge #1: Accessing Instance Fields

The ActiveModel type objects that hold the attribute metadata don’t expose a public method for everything I needed, for example range and mapping. So I had to hack my way through:

# range
attr.instance_variable_defined?(:@range) ? attr.instance_variable_get(:@range) : nil

# mapping
attr.instance_variable_defined?(:@mapping) ? attr.instance_variable_get(:@mapping).values : nil

I hope this will not break in the foreseeable future.
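The pattern above can be wrapped in a small helper. This is only a sketch: read_ivar, RangedType and PlainType are hypothetical names, and the two classes stand in for the real type objects.

```ruby
# Hypothetical helper: return the instance variable if it is set, otherwise nil.
def read_ivar(object, name)
  object.instance_variable_defined?(name) ? object.instance_variable_get(name) : nil
end

# Stand-in for a type object that keeps a private @range (illustration only).
class RangedType
  def initialize(range)
    @range = range
  end
end

# Stand-in for a type object without any range information.
class PlainType; end

p read_ivar(RangedType.new(0...256), :@range) # prints 0...256
p read_ivar(PlainType.new, :@range)           # prints nil
```

Ruby’s instance_variable_get already returns nil for a variable that was never set, so the explicit check mostly documents the intent.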

Challenge #2: Types Needed Manual Mappings

Although the types provided by ActiveRecord carry useful metadata, the type method they respond to returns a Symbol that doesn’t always map directly to a Ruby class. For example, for :big_integer there is no class called BigInteger.

Therefore, I had to write the mappings manually for all the default types registered with ActiveModel. A sample of the mappings looks like this:

register(:integer) do |descriptor|
  choices_or_else(descriptor) do
    FFaker.rand(descriptor.limit).to_i
  end
end
register(:big_integer, &registry[:integer])

register(:float) do |descriptor|
  choices_or_else(descriptor) do
    FFaker.rand(descriptor.limit).to_f
  end
end
register(:decimal, &registry[:float])

You can see that some types can share their mapping with other types.

I used the FFaker gem to generate the fake data.

If you wonder where register and choices_or_else come from, they come from a module I wrote and included for this specific case.
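To make the idea concrete, here is a minimal sketch of what such a module could look like. It is not the gem’s actual code: GeneratorRegistry, Generators, Descriptor and generate are hypothetical names, the method bodies are assumptions, and Kernel#rand stands in for FFaker so the example runs on plain Ruby.

```ruby
# Hypothetical registry module; names mirror the snippet above, bodies are assumptions.
module GeneratorRegistry
  # Generators keyed by type symbol.
  def registry
    @registry ||= {}
  end

  def register(type, &block)
    registry[type] = block
  end

  # Pick one of the allowed choices if there are any, otherwise run the block.
  def choices_or_else(descriptor)
    choices = descriptor.choices
    return choices.sample if choices && !choices.empty?

    yield
  end

  # Look up the generator for the descriptor's type and run it.
  def generate(descriptor)
    registry.fetch(descriptor.type).call(descriptor)
  end
end

# Simplified descriptor for this sketch only.
Descriptor = Struct.new(:type, :limit, :choices)

module Generators
  extend GeneratorRegistry

  register(:integer) do |descriptor|
    choices_or_else(descriptor) do
      rand(descriptor.limit || 100) # Kernel#rand replaces FFaker here
    end
  end
  # Reuse the integer generator for big integers.
  register(:big_integer, &registry[:integer])
end

p Generators.generate(Descriptor.new(:big_integer, 10, nil))  # an Integer in 0...10
p Generators.generate(Descriptor.new(:integer, nil, %w[a b])) # "a" or "b"
```

Storing the generators as blocks in a hash is what makes sharing cheap: registering :big_integer only passes the existing block along under a second key.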

Challenge #3: Multiple Rails Versions

I didn’t want this library to work only on the latest Rails version (6.1 at the time of writing). Many people run Rails apps that are not up to date, and they should be able to use this gem too.

Luckily, I found Appraisal by thoughtbot. It has a straightforward interface for testing against multiple Rails versions by generating a separate Gemfile for each.

Challenge #4: Incompatibility Between Rails Versions

I faced one incompatibility issue while writing tests. It was with the DateTime and Time data types.

Before Rails 6.1, the values for :datetime and :time are Time and ActiveRecord::Type::Time::Value respectively, while in Rails 6.1 both of them are DateTime.

To test against this edge case, I used some RubyGems APIs. It looks like this:

if RAILS_GEM.version < Gem::Version.new('6.1')
  expect(dummy.datetime).to be_a Time
  expect(dummy.time).to be_a ActiveRecord::Type::Time::Value
else
  expect(dummy.datetime).to be_a DateTime
  expect(dummy.time).to be_a DateTime
end

I haven’t faced any other issues yet.
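The version check can be tried in isolation. The sketch below uses only Gem::Version from RubyGems; before_rails_6_1? is a hypothetical helper name, not part of the gem.

```ruby
require 'rubygems' # loaded by default in modern Ruby; explicit for clarity

# Hypothetical helper: true when the given version string predates 6.1.
def before_rails_6_1?(version_string)
  Gem::Version.new(version_string) < Gem::Version.new('6.1')
end

p before_rails_6_1?('6.0.3.6') # => true
p before_rails_6_1?('6.1.0')   # => false
p before_rails_6_1?('6.10.0')  # => false, segments are compared numerically
```

Building versions from strings is the safer habit: a float literal such as 6.10 turns into "6.1" once it is converted to a string, which would silently change the comparison.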

Challenge #5: Creating Tables Without Creating Migrations

Supporting multiple Rails versions would require multiple versions of the migrations, and maybe multiple versions of an example Rails app, which is tedious to maintain.

Instead, I looked for a way to generate Rails apps on the fly, along with the migrations and the models needed for the tests, or to find the API Rails itself uses to modify the database schema.

Fortunately, thoughtbot had solved the same problem: they created a macro that creates models on the fly along with their database tables. That was exactly what I needed. It was amazing, so I used it!

Other Challenges

There are other challenges I haven’t solved yet, including generating data for relations and getting more metadata from validations. For now, you should use save(validate: false) for some edge cases, such as a maximum length.

Conclusion

We’ve seen some details of why and how this gem is implemented. It needs improvement, but it’s fun to use at this stage. I would like to hear from you when you use it!

Again, experience with multiple stacks is beneficial. You can always bring brilliant ideas from one stack to another. This is the beauty of programming!