Monday, July 11, 2016

Getting rid of the billion dollar mistake – null references

Null reference exceptions are one of the most common errors programmers run into. The compiler generally cannot catch them, so they surface only at runtime. A null reference exception is thrown when code tries to access a member of an object through a reference that is null. To avoid the exception, the code has to verify that the reference is not null before using it.
For example, the Welcome method in the code below throws a null reference exception if the Identity object is null.

using System.Security.Principal; // provides IIdentity and GenericIdentity

public class User
{
    public IIdentity Identity { get; private set; }

    public User(IIdentity identity)
    {
        Identity = identity;
    }
    public string Welcome()
    {
        return $"Hello {Identity.Name}";
    }
}

To avoid null reference exceptions, it’s common practice to guard any logic that runs against a possibly null instance. Every statement that operates on a potentially null object turns into an if-then-else statement, which creates multiple execution paths through the code, increasing its complexity and making it harder to test.

public class User
{
    public IIdentity Identity { get; private set; }

    public User(IIdentity identity)
    {
        Identity = identity;
    }
    public string Welcome()
    {
        if(Identity == null)
        {
            return string.Empty;
        }
        return $"Hello {Identity.Name}";
    }
}

How to get rid of the null checks?

Null object pattern


A null object encapsulates the absence of an object by providing a stand-in that does nothing when its methods are invoked. To create a null object implementation, we define an abstraction that specifies the available operations, concrete classes that implement it, and a null object class that provides a do-nothing implementation of it. Instead of passing a null reference up to the calling methods, we pass the null object instance, which supplies do-nothing or default behavior, so the subsequent methods need no guarded null checks. Our previous sample code can be refactored as shown below to avoid null checks. To create the abstraction, we can use the Extract Interface refactoring:

public interface IUser
{
    IIdentity Identity { get; }

    string Welcome();
}

The new implementation of the User class looks like this:

public class User : IUser
{
    public IIdentity Identity { get; private set; }

    public User(IIdentity identity)
    {
        Identity = identity;
    }
    public string Welcome()
    {
        return $"Hello {Identity.Name}";
    }
}

The null object implementation of the IUser interface can be written as:

public sealed class NullUser : IUser
{
    public IIdentity Identity
    {
        get
        {
            return new NullIdentity();
        }
    }

    public string Welcome()
    {
        return string.Empty;
    }
}

public sealed class NullIdentity : IIdentity
{
    public string AuthenticationType
    {
        get { return string.Empty; }
    }

    public bool IsAuthenticated
    {
        get { return false; }
    }

    public string Name
    {
        get { return string.Empty; }
    }
}

Code that uses this new structure does not need any guard statements. We’ll create a sample UserFactory that returns a null object implementation when no real user is available, and then consume it in a test method that needs no if-then-else statement for null checks.

public static class UserFactory
{
    public static IUser Create(string name)
    {
        if(name == "Admin")
        {
            return new User(new GenericIdentity(name));
        }
        return new NullUser();
    }
}

[TestMethod]
public void NullObjectImplementationDoesNotThrowExceptionsOnNullReferences()
{
    var user = UserFactory.Create("Dummy");
    var actual = user.Welcome(); // No need to check the null instance here
    Assert.IsTrue(string.IsNullOrEmpty(actual));
}
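
As a further illustration, here is a minimal, self-contained sketch of a consumer that processes several users without a single null check. The types are trimmed-down copies of the ones above (the Identity member is left out for brevity), and the shared NullUser.Instance field and the Greeter class are assumptions introduced for this example, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copies of the types above so the sketch compiles on its own.
public interface IUser
{
    string Welcome();
}

public sealed class User : IUser
{
    private readonly string _name;

    public User(string name) { _name = name; }

    public string Welcome() { return $"Hello {_name}"; }
}

public sealed class NullUser : IUser
{
    // Assumption: the null object is stateless, so one shared instance is enough.
    public static readonly NullUser Instance = new NullUser();

    private NullUser() { }

    public string Welcome() { return string.Empty; }
}

public static class UserFactory
{
    public static IUser Create(string name)
    {
        // Same rule as the factory above: only "Admin" yields a real user.
        return name == "Admin" ? (IUser)new User(name) : NullUser.Instance;
    }
}

// Hypothetical consumer: calls Welcome() on every user with no null checks.
public static class Greeter
{
    public static List<string> GreetAll(IEnumerable<string> names)
    {
        return names
            .Select(name => UserFactory.Create(name))
            .Select(user => user.Welcome())
            .ToList();
    }
}
```

Calling Greeter.GreetAll(new[] { "Admin", "Guest" }) yields one greeting for the real user and an empty string for the null object, and the calling code never has to tell the two apart.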

Maybe pattern

The null object pattern is useful when the caller does not need to act differently depending on the type of object returned. With the null object pattern, we treat the result of calling the IUser implementation the same way regardless of whether we got a real User or not. If we want to let the caller decide explicitly whether to check for a missing value, we need a way to tell whether the object is null. The Maybe pattern can be used in this scenario.

We can create a generic implementation of the Maybe pattern in C# using a Maybe class as follows:
                                                                                                                                             
public class Maybe<T> where T : class
{
    public T Value { get; private set; }

    public Maybe(T value)
    {
        Value = value;
    }

    Maybe() { }

    public static Maybe<T> Default
    {
        get
        {
            return new Maybe<T>();
        }
    }

    public bool HasValue
    {
        get
        {
            return Value != default(T);
        }
    }
}

To use this implementation in our user factory, we can change the code as follows:

public static class UserFactory
{
    public static Maybe<User> Create(string name)
    {
        if (name == "Admin")
        {
            return new Maybe<User>(new User(new GenericIdentity(name)));
        }
        return Maybe<User>.Default;
    }
}

Compared to the previous implementation, the Maybe object signals to the caller that the object exposed via the Value property is optional and may be null. It’s the caller’s responsibility to check the HasValue property before performing operations on the object.
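
A caller that honors this contract might look like the following sketch. It repeats a compact Maybe class and a simplified User (taking a plain name instead of an IIdentity) so that it compiles on its own; the WelcomePage class and its "Please sign in" fallback text are hypothetical and exist only for illustration.

```csharp
using System;

// Compact copy of the Maybe class above.
public class Maybe<T> where T : class
{
    public T Value { get; private set; }

    public Maybe(T value) { Value = value; }

    private Maybe() { }

    public static Maybe<T> Default { get { return new Maybe<T>(); } }

    public bool HasValue { get { return Value != null; } }
}

// Simplified User: takes a plain name instead of an IIdentity.
public class User
{
    private readonly string _name;

    public User(string name) { _name = name; }

    public string Welcome() { return $"Hello {_name}"; }
}

public static class UserFactory
{
    public static Maybe<User> Create(string name)
    {
        return name == "Admin"
            ? new Maybe<User>(new User(name))
            : Maybe<User>.Default;
    }
}

// Hypothetical caller that decides for itself what a missing user means.
public static class WelcomePage
{
    public static string Render(string name)
    {
        Maybe<User> user = UserFactory.Create(name);
        if (!user.HasValue)
        {
            return "Please sign in"; // assumed fallback text
        }
        return user.Value.Welcome();
    }
}
```

Unlike the null object version, the caller here chooses its own behavior for the missing case instead of silently receiving an empty string.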
Coupled with some extension methods and delegates, you can perform different operations without writing if-then-else statements at the call site, as shown below.

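// Extension methods must be declared in a static class; the class name is illustrative.
public static class MaybeExtensions
{
    // Invokes action with value when the Maybe holds an object, otherwise emptyAction.
    public static K Execute<T, K>(this Maybe<T> instance, Func<K, K> action, Func<K, K> emptyAction, K value) where T : class
    {
        if (instance.HasValue)
        {
            return action.Invoke(value);
        }
        return emptyAction.Invoke(value);
    }
}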

[TestMethod]
public void ExecuteInvokesTheEmptyExecutionDelegateWhenMaybeObjectDoesNotHaveAValue()
{
    var actual = UserFactory
        .Create(string.Empty)
        .Execute(NonEmptyFunc, EmptyFunc, 10);

    Assert.AreEqual(9, actual); // expected value comes first
}

Here, EmptyFunc and NonEmptyFunc are simple functions written as:

int NonEmptyFunc(int value)
{
    return ++value;
}

int EmptyFunc(int value)
{
    return --value;
}
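
Note that the delegates passed to Execute never receive the wrapped object itself. One possible variation, sketched below under the assumption that the caller wants to work with the value, hands Value to the non-empty delegate so the caller never touches the Value property directly. The Match method, the MaybeMatchExtensions class and the MatchDemo class are hypothetical names introduced for this example.

```csharp
using System;

// Compact copy of the Maybe class above.
public class Maybe<T> where T : class
{
    public T Value { get; private set; }

    public Maybe(T value) { Value = value; }

    private Maybe() { }

    public static Maybe<T> Default { get { return new Maybe<T>(); } }

    public bool HasValue { get { return Value != null; } }
}

public static class MaybeMatchExtensions
{
    // Passes the wrapped value to 'some' when present; otherwise calls 'none'.
    public static K Match<T, K>(this Maybe<T> instance, Func<T, K> some, Func<K> none)
        where T : class
    {
        return instance.HasValue ? some(instance.Value) : none();
    }
}

public static class MatchDemo
{
    // Returns the length of the wrapped string, or 0 when there is no value.
    public static int NameLength(Maybe<string> name)
    {
        return name.Match(value => value.Length, () => 0);
    }
}
```

With this shape, the branch on HasValue lives in one place and the call site only supplies the two behaviors.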

Summary

As the examples above show, both approaches reduce branching in the calling code and spare you from dealing with null references directly. Depending on whether the caller needs to act on the state of the object, you can choose either a null object or a Maybe implementation.

Quoted from:

"I call it my billion-dollar mistake." - Sir C. A. R. Hoare, on his invention of the null reference
