How to generate YAML from Ruby objects without type annotations

posted by Ayush Newatia
6 August, 2021



Serialising a Ruby object to YAML is dead simple. Ruby’s standard library has a built-in yaml module, which uses Psych under the hood. All we need to do is require 'yaml' and we’re good to go! But there’s a catch.

Let’s say we have Photo and Album classes, where an Album contains an array of Photos, and we want to serialise an Album to YAML. This is what such a script would look like:

require 'yaml'

class Photo
  attr_reader :file

  def initialize(file)
    @file = file
  end
end

class Album
  attr_accessor :name, :photos

  def initialize(name, photos)
    @name = name
    @photos = photos
  end
end

photos = [
  Photo.new("DSC_0001.jpg"),
  Photo.new("DSC_0002.jpg"),
  Photo.new("DSC_0003.jpg")
]

album = Album.new("Outdoors", photos)
puts album.to_yaml

The above script will print out:

--- !ruby/object:Album
name: Outdoors
photos:
- !ruby/object:Photo
  file: DSC_0001.jpg
- !ruby/object:Photo
  file: DSC_0002.jpg
- !ruby/object:Photo
  file: DSC_0003.jpg

What are all those class annotation type thingies?! They are YAML tags that record the Ruby class each item was serialised from. When this YAML is deserialised, Ruby will try to turn each item back into an object of the class named in its tag. Now if we only ever want to use this YAML in the context of our script or app, that’s fine. However, if it needs to be portable, 😬: a YAML parser in another language has no Album or Photo class to map those tags to.
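The tags can get in the way even within Ruby. As a small sketch, assuming the data is read back with YAML.safe_load (Psych's restricted loader, which only builds basic types unless told otherwise), the annotated document is rejected. The result variable and the trimmed-down input are just for illustration:

```ruby
require 'yaml'

# The annotated output from the script, trimmed to a single photo
tagged = <<~DOC
  --- !ruby/object:Album
  name: Outdoors
  photos:
  - !ruby/object:Photo
    file: DSC_0001.jpg
DOC

begin
  YAML.safe_load(tagged)
  result = "loaded"
rescue Psych::DisallowedClass => e
  # safe_load refuses to instantiate classes it was not explicitly allowed to
  result = "rejected: #{e.class}"
end

puts result
```

How parsers in other languages treat unknown tags varies, so it is safest not to emit them at all.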

To serialise an object to YAML without class annotations, we first need to convert it to a Hash and then to YAML.
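To make that concrete, here is a minimal sketch of the conversion written by hand. The to_h methods are an assumption for illustration and not part of the original classes; each one simply lists the attributes to include:

```ruby
require 'yaml'

class Photo
  attr_reader :file

  def initialize(file)
    @file = file
  end

  # Hand-written conversion: list each attribute explicitly
  def to_h
    { "file" => file }
  end
end

class Album
  attr_accessor :name, :photos

  def initialize(name, photos)
    @name = name
    @photos = photos
  end

  # Convert the nested photos as well, otherwise their tags come back
  def to_h
    { "name" => name, "photos" => photos.map(&:to_h) }
  end
end

album = Album.new("Outdoors", [Photo.new("DSC_0001.jpg")])
yaml = album.to_h.to_yaml
puts yaml
```

Because the Hash contains only strings, arrays and other hashes, the output carries no !ruby/object tags.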

Since this example is very simple, we could just hand-write methods that convert these specific objects into Hashes. But as our app grows, that becomes unsustainable, so let’s take a look at how we could write a generic module to convert any object to a Hash.

module Hashify
  # Classes that include this module can exclude certain
  # instance variables from their hash representation by overriding
  # this method to return their names, e.g. ["@secret"]
  def ivars_excluded_from_hash
    []
  end

  def to_hash
    hash = {}
    excluded_ivars = ivars_excluded_from_hash

    # Iterate over all the instance variables and store their
    # names and values in a hash
    instance_variables.each do |var|
      next if excluded_ivars.include? var.to_s

      value = instance_variable_get(var)
      # Convert nested objects (and each element of an array) that
      # respond to to_hash; leave plain values such as strings as-is
      if value.is_a? Array
        value = value.map { |item| item.respond_to?(:to_hash) ? item.to_hash : item }
      elsif value.respond_to? :to_hash
        value = value.to_hash
      end

      hash[var.to_s.delete("@")] = value
    end

    hash
  end
end

We can now include the above module in our Photo and Album classes and serialise them to YAML without the class annotations!

class Photo
  include Hashify

  # ...
end

class Album
  include Hashify

  # ...
end

photos = [
  Photo.new("DSC_0001.jpg"),
  Photo.new("DSC_0002.jpg"),
  Photo.new("DSC_0003.jpg")
]

album = Album.new("Outdoors", photos)
puts album.to_hash.to_yaml

The script will now output:

---
name: Outdoors
photos:
- file: DSC_0001.jpg
- file: DSC_0002.jpg
- file: DSC_0003.jpg

This is just plain YAML and can be read by an app written in any language that has a YAML parser!
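The ivars_excluded_from_hash hook in Hashify is never exercised in the example above, so here is a rough sketch of how it might be used. The two-argument Photo and its @local_path variable are hypothetical, and the module is repeated (in condensed form) only so the snippet runs on its own:

```ruby
require 'yaml'

# Condensed copy of the Hashify module so this snippet is self-contained
module Hashify
  def ivars_excluded_from_hash
    []
  end

  def to_hash
    hash = {}
    excluded_ivars = ivars_excluded_from_hash

    instance_variables.each do |var|
      next if excluded_ivars.include? var.to_s

      value = instance_variable_get(var)
      value = value.map(&:to_hash) if value.is_a? Array

      hash[var.to_s.delete("@")] = value
    end

    hash
  end
end

# Hypothetical variant of Photo with an extra instance variable
# (@local_path) that should stay out of the YAML
class Photo
  include Hashify

  def initialize(file, local_path)
    @file = file
    @local_path = local_path
  end

  # Names are compared against var.to_s, so keep the leading "@"
  def ivars_excluded_from_hash
    ["@local_path"]
  end
end

photo = Photo.new("DSC_0001.jpg", "/tmp/DSC_0001.jpg")
puts photo.to_hash.to_yaml
```

Note that the names in the exclusion list need the leading "@", because the module compares them against the instance variable name as a string.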