The problem with null

In programming languages that have the concept of null, any value in a system can be (for example) a number or null; a string or null; an object or null.

Null (or nil in Ruby, undefined in JavaScript, etc.) presents a problem for us when working with untyped sources of data. If we’re not careful, null values can spread throughout our software, causing errors in remote code locations that expect non-null values.

Let’s examine how null values can enter a system and what to do about them. In the following code snippet, we will try to fetch the “Q Score” of a coffee from an array of coffees:

coffees = [
  {
    name: 'Tanzania AAA',
    roast_date: '2016-09-01',
    flavour: 'very nice',
    metadata: {
      altitude: 1600,
      q_score: 83.1
    }
  }
]

coffees[0][:metadatum][:q_score]
# => NoMethodError: undefined method `[]' for nil:NilClass

# ~> NoMethodError
# ~> undefined method `[]' for nil:NilClass
# ~>
# ~> /var/folders/v5/nl_spw2j3qj0_scbfy6gv_sm0000gn/T/seeing_is_believing_temp_dir20160913-70485-je99qk/program.rb:13:in `<main>'

That didn’t work because we misspelled the :metadata key.

One way to avoid this sort of exception is to explicitly check for keys in the hash, in a defensive style of coding:

coffees = [
  {
    name: 'Tanzania AAA',
    roast_date: '2016-09-01',
    flavour: 'very nice',
    metadata: {
      altitude: 1600,
      q_score: 83.1
    }
  }
]

if coffees[0].key?(:metadata)
  metadatum = coffees[0][:metadata]
  if metadatum.key?(:q_score)
    metadatum[:p_score]
  end
end # => nil

But that’s a lot of code, it suffers from quite a bit of duplication, and it doesn’t really solve the problem. What if we type the key correctly in the conditional but mistype it when we access the element, as we did above with :q_score / :p_score? We silently get nil again.

Ruby has a Hash#fetch method which is useful in these situations:

coffees = [
  {
    name: 'Tanzania AAA',
    roast_date: '2016-09-01',
    flavour: 'very nice',
    metadata: {
      altitude: 1600,
      q_score: 83.2
    }
  }
]

coffees[0].fetch(:metadatum).fetch(:q_score)
# => KeyError: key not found: :metadatum

# ~> KeyError
# ~> key not found: :metadatum
# ~>
# ~> /var/folders/v5/nl_spw2j3qj0_scbfy6gv_sm0000gn/T/seeing_is_believing_temp_dir20160913-70032-15zv3x0/program.rb:13:in `fetch'
# ~> /var/folders/v5/nl_spw2j3qj0_scbfy6gv_sm0000gn/T/seeing_is_believing_temp_dir20160913-70032-15zv3x0/program.rb:13:in `<main>'

The error we get from that code tells us exactly what’s wrong.
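Hash#fetch also accepts a fallback for cases where a missing key is acceptable, either as a second argument or as a block. The following is a minimal sketch: the helper names origin_of and metadata_of, the :origin key and the custom error message are all made up for illustration.

```ruby
coffee = {
  name: 'Tanzania AAA',
  metadata: { altitude: 1600, q_score: 83.2 }
}

# Hypothetical helper: a second argument to fetch is returned
# when the key is missing.
def origin_of(coffee)
  coffee.fetch(:origin, 'unknown')
end

# Hypothetical helper: a block receives the missing key, which is
# handy for raising a more descriptive error.
def metadata_of(coffee)
  coffee.fetch(:metadata) { |key| raise KeyError, "coffee has no #{key.inspect} key" }
end

origin_of(coffee)                    # => "unknown"
metadata_of(coffee).fetch(:q_score)  # => 83.2

# metadata_of({}) would raise KeyError: coffee has no :metadata key
```

A default is only appropriate when the caller can genuinely carry on without the value. Otherwise the plain one-argument fetch keeps the failure visible.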

Tolerating nulls

But Ruby also has Array#dig, Hash#dig and Struct#dig, which we could use like this:

coffees = [
  {
    name: 'Tanzania AAA',
    roast_date: '2016-09-01',
    flavour: 'very nice',
    metadata: {
      altitude: 1600,
      q_score: 83.2
    }
  }
]

coffees.dig(1, :metadatum, :q_score) # => nil

We didn’t get an error for that, just nil. Depending on the circumstances, that might be ok. But we can’t tell whether the value at [1][:metadatum][:q_score] is actually nil or whether one of the steps along the way failed.
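To make that ambiguity concrete, here is a small sketch with made-up data (these two arrays are illustrative and not part of the examples above). Three quite different situations all produce the same nil:

```ruby
with_nil_score   = [{ metadata: { q_score: nil } }]
without_metadata = [{ name: 'Tanzania AAA' }]

with_nil_score.dig(0, :metadata, :q_score)   # => nil (the key exists, its value is nil)
without_metadata.dig(0, :metadata, :q_score) # => nil (the :metadata key is missing)
without_metadata.dig(5, :metadata, :q_score) # => nil (index 5 is out of range)
```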

Ruby’s dig method (introduced in Ruby 2.3) appears to be so named because you are digging for data with uncertainty as to whether it’s all there. For quickly prototyping something, this is no doubt appropriate, but in a production system we’d like more visibility into errors so we can fix our code or investigate API changes.

This leaves us with two choices: fetch and dig. With fetch we get specific errors when keys are missing from Hashes, but it is rather verbose when reaching deep into data structures (not uncommon when working with 3rd party data). With dig we can express the problem quite simply, as a sequence of keys or indexes, but we get no visibility of missing keys or indexes.
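One possible middle ground is a small helper that takes a path the way dig does but calls fetch at every step. This is only a sketch: fetch_path is a hypothetical name and not something Ruby provides.

```ruby
# Hypothetical helper (not part of Ruby): walks a path of keys and
# indexes using fetch at every step, so the first missing step raises.
def fetch_path(data, *path)
  path.reduce(data) { |node, key| node.fetch(key) }
end

coffees = [
  { name: 'Tanzania AAA', metadata: { altitude: 1600, q_score: 83.2 } }
]

fetch_path(coffees, 0, :metadata, :q_score) # => 83.2

begin
  fetch_path(coffees, 0, :metadatum, :q_score)
rescue KeyError => e
  puts e.message # prints "key not found: :metadatum"
end
```

Hash#fetch raises KeyError and Array#fetch raises IndexError, so the error names the step that failed. If a value in the middle of the path is itself nil, the next step still fails with a NoMethodError, so this does not remove the need to think about nil values in the data.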

Nulls all the way down

The examples so far have been short, so the error has been easy to spot: we get nil when we expected a value. But the problem of null causes havoc far beyond the code that introduces the null. To see how, let’s examine a larger code sample:

all_coffees = [
  {
    name: 'Tanzania AAA',
    roast_date: '2016-09-01',
    flavour: 'very nice',
    metadata: {
      altitude: 1600,
      q_score: 83.2
    }
  },
  {
    name: 'Brazil AAA',
    roast_date: '2016-09-02',
    flavour: 'chocolate and broken dreams',
    metadata: {
      altitude: 1500,
      q_score: 88.1
    }
  },
  {
    name: 'Decaf whatever',
    roast_date: '2016-09-03',
    flavour: 'despair and cherries',
    metadata: {
      altitude: 1450,
      q_score: 72.1
    }
  }
]

def coffees_roasted_at(coffees, date)
  coffees.select { |coffee| coffee[:roasted_at_date] == date }
end

def lowest_altitude_coffee(coffees)
  coffees.sort_by { |coffee| coffee[:metadata][:altitude] }.first
end

lowest_altitude_coffee(
  coffees_roasted_at(all_coffees, '2016-09-02')
)[:name] # => NoMethodError: undefined method `[]' for nil:NilClass

# ~> NoMethodError
# ~> undefined method `[]' for nil:NilClass
# ~>
# ~> /var/folders/3l/1lt4f3dx4658n0qw4lpx_6t00000gn/T/seeing_is_believing_temp_dir20160913-59940-dtfhjy/program.rb:39:in `<main>'

The last line of that program has raised a NoMethodError on nil (we’re calling [:name] on nil), but where did that nil come from? Now we have to hunt up the stack of data and method calls to find what introduced a null value. (Here, the misspelled :roasted_at_date key makes select return an empty array, and calling first on an empty array returns nil.) Replacing any call to Hash#[] with Hash#fetch will solve most of these issues and result in stack traces that point to the introduction of the null value.
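As a sketch of that replacement, here is coffees_roasted_at again with the same typo but using fetch, run against a trimmed-down copy of the data (the shortened hashes are just to keep the example small):

```ruby
all_coffees = [
  { name: 'Tanzania AAA', roast_date: '2016-09-01', metadata: { altitude: 1600 } },
  { name: 'Brazil AAA',   roast_date: '2016-09-02', metadata: { altitude: 1500 } }
]

# Same method as before, typo included, but using fetch instead of [].
def coffees_roasted_at(coffees, date)
  coffees.select { |coffee| coffee.fetch(:roasted_at_date) == date }
end

begin
  coffees_roasted_at(all_coffees, '2016-09-02')
rescue KeyError => e
  puts e.message # prints "key not found: :roasted_at_date"
end
```

The failure now happens inside coffees_roasted_at, at the misspelled key, rather than several calls later. Note that fetch does not help with first returning nil for an empty array, which still needs its own check.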

Null objects

What happens when you call a method on nil? In Ruby, you get a NoMethodError, as you would expect (nil is an instance of NilClass, which likely does not implement the method you want). One way to deal with the problem of nulls, instead of sprinkling null checks throughout your code, is to use the Null Object pattern. The fundamental idea behind the pattern is to create Null classes that implement the same interface as your non-null objects do. An example of that would be:

Coffee = Struct.new(:name, :roast_date, :flavour, :metadata) do
  def self.from_hash(attrs:)
    coffee = new
    coffee.members.each do |key|
      coffee[key] = attrs.fetch(key)
    end
    coffee
  end

  def altitude
    metadata.fetch(:altitude)
  end

  def q_score
    metadata.fetch(:q_score)
  end
end

class NullCoffee
  attr_reader :name, :roast_date, :flavour, :altitude, :q_score
end

def coffee_attrs
  [
    {
      name: 'Tanzania AAA',
      roast_date: '2016-09-01',
      flavour: 'very nice',
      metadata: {
        altitude: 1600,
        q_score: 83.2
      }
    },
    {
      name: 'Brazil AAA',
      roast_date: '2016-09-02',
      flavour: 'chocolate and broken dreams',
      metadata: {
        altitude: 1500,
        q_score: 88.1
      }
    },
    nil
  ]
end

coffee_attrs.map(&:class)
# => [Hash, Hash, NilClass]

def coffees
  coffee_attrs.map { |coffee_attributes|
    if coffee_attributes
      Coffee.from_hash(attrs: coffee_attributes)
    else
      NullCoffee.new
    end
  }
end

coffees.map(&:class)
# => [Coffee, Coffee, NullCoffee]

def earliest_roast_date
  coffees.select(&:roast_date).sort_by(&:roast_date).first
end

earliest_roast_date
# => #<struct Coffee
#     name="Tanzania AAA",
#     roast_date="2016-09-01",
#     flavour="very nice",
#     metadata={:altitude=>1600, :q_score=>83.2}>

def names_match?(pattern:)
  coffees.select { |coffee| coffee.name =~ pattern }
end

names_match?(pattern: /AAA/)
# => [#<struct Coffee
#      name="Tanzania AAA",
#      roast_date="2016-09-01",
#      flavour="very nice",
#      metadata={:altitude=>1600, :q_score=>83.2}>,
#     #<struct Coffee
#      name="Brazil AAA",
#      roast_date="2016-09-02",
#      flavour="chocolate and broken dreams",
#      metadata={:altitude=>1500, :q_score=>88.1}>]

In the above example, using NullCoffee objects instead of nil itself allows us to write code that works using the basic principle of Duck Typing.

coffees.select(&:roast_date).sort_by(&:roast_date).first could be read as “select all the coffees that have a roast date, sort them and take the first one”. There’s no mention there of nil values or (crucially for Duck Typing) the type/class of each coffee (Coffee or NullCoffee). We haven’t banished null entirely, though: NullCoffee.new.roast_date returns nil (that’s what the call to select relies on to filter for coffees with valid roasting dates). We’ve just avoided calling our object’s methods on a nil value.
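A null object can go one step further and return a neutral value of the expected type instead of nil. The sketch below assumes a cut-down struct and a null class, both with made-up names (BasicCoffee and UnknownCoffee), and an arbitrary placeholder name:

```ruby
# Hypothetical cut-down version of the Coffee struct.
BasicCoffee = Struct.new(:name, :roast_date)

# Hypothetical variant of NullCoffee: name returns a placeholder
# string, so callers can treat it like any other coffee name.
class UnknownCoffee
  def name
    'Unknown coffee'
  end
end

def coffee_names(coffees)
  coffees.map { |coffee| coffee.name.upcase }
end

coffee_names([BasicCoffee.new('Brazil AAA', '2016-09-02'), UnknownCoffee.new])
# => ["BRAZIL AAA", "UNKNOWN COFFEE"]
```

The trade-off is that a placeholder can no longer be filtered out by checking for nil, so whether a default or nil is the better return value depends on how each attribute is used.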

Banishing null using a type system

In statically typed programming languages, types can be used to distinguish between what would be null and what would be a value. Rust is a programming language that uses types to avoid nulls, and it has two main types for this: Option and Result.

An Option type in Rust can be either Some(value) or None (the closest thing Rust has to a null value). Let’s try this type out:

fn operation_that_could_fail1(_in: String) -> Option<String> {
    Some("that worked".to_string())
}

fn operation_that_could_fail2(_in: String) -> Option<String> {
    None
}

fn main() {
    let result = operation_that_could_fail1("input".to_string())
        .and_then(|val| operation_that_could_fail2(val));

    if let Some(value) = result {
        println!("We got '{}' back", value);
    } else {
        println!("Nothing came back, oh dear.");
    }
}

The idea behind an optional type is that it can either contain a value or can be None. This isn’t conceptually all that different from null at first glance, until you realise that code such as let a: i32 = 10 does not allow that variable to ever hold a null-like value. It can only ever be a 32-bit integer; anything else is a compile-time error.

If the if let Some(value) = result code above seemed a bit verbose, then Rust’s Option implementation includes a method unwrap for situations where you might want to quickly write some code and write more defensive error checking later:

fn main() {
    let first: Option<&str> = Some("ada");

    println!("first name is {}", first.unwrap());

    let last: Option<&str> = None;

    println!("name is {}", last.unwrap());

    println!("This line will never be reached");
}

The preceding example will panic on the line that calls last.unwrap() because last is a None value (the line that prints the first name works without issue, however). This means you can write quite simple code without checking for None values, and it will crash at the offending line when something goes wrong.

What about default values? Option provides a method called unwrap_or which allows you to provide a default value in case the value is None:

fn main() {
    let name: Option<&str> = None;

    println!("Name is: {}", name.unwrap_or("Unknown"));
}

The preceding snippet will fail to find a name, so the default value is returned instead and it prints “Name is: Unknown”.

Silent failures

Fundamentally, all this treatment of nulls comes down to how we deal with failures. Sometimes a null value indicates a failure and sometimes it doesn’t. Picking the right tools and code patterns to match can be quite challenging. Rust deals with this situation by providing the Result type. A Result is either Ok(T) or Err(E), which is very similar to Option except that errors have data associated with them. These error values allow the cause of an error to be provided to the function’s caller. The same tricks can still be performed on Result types, including chaining a pipeline of functions together so that data gets passed from function to function (short-circuiting out upon the first error):

fn main() {
    let numbers = "4".parse::<u32>().and_then(|num| {
        "hello".parse::<u32>().map(|another| num + another)
    });

    match numbers {
        Ok(n) => {
            println!("That went well, the result is: {}", n);
        }
        Err(e) => {
            println!("Oh dear! Something bad happened: {:?}", e);
        }
    }
}

In the above example, when the string "hello" fails to parse as a number, the entire expression evaluates to an Err value, and the closure passed to map, which would otherwise add the two numbers together, is never run. This example is quite small and contrived, but hopefully the pattern is clear. The way Option and Result types can be chained together is quite similar, but if you need to know the cause of a failure (such as a number failing to parse), then Result is useful.

Conclusion

The situation with null is a tricky one. In many programming languages, both static and dynamic, any variable can be null, so writing code in them requires extra care. In those statically typed languages that don’t provide null values, the compiler always encourages us to handle all eventualities. The end result of strict and explicit treatment of None or Error values is that we never see simple programming errors related to unexpected null in production code.

A tool that prevents whole classes of errors from occurring in production is something we should all seriously consider, weighed against the other factors around its use. This is why the Elm programming language makes such a big deal of the fact that code written in Elm never throws exceptions in production code. Compared to the runtime experience with JavaScript, that’s a very compelling advantage.

The story on the backend is not as straightforward. I’ve illustrated the advantages of static languages without null using Rust, and although there are other options here (Haskell, Swift, Scala, etc.), for one reason or another these languages are not in widespread use yet (for backend services). Up until recently, if you wanted a good static programming language for your backend service, your choices were limited and involved seemingly overwhelmingly academic concerns.

For better or worse, choosing a programming language for a system often has less to do with the semantics of the language itself and more to do with the tooling, developer experience (a highly subjective criterion) and community (involving everything from the available libraries and frameworks, to the inclusiveness and friendliness of prominent programmers and their approach to engaging newcomers). For all of those criteria, I see Rust as an excellent future option for teams that value stability, speed and excellent tooling. Currently, the ecosystem is young and there isn’t a clear set of mature libraries that teams could use to build backend services in the same fashion as they can currently with Ruby or JavaScript, but it is definitely a technology to keep an eye on.

Will Roe's blog

November 2, 2016

Avoiding null for an easier life