Abusing bang methods in Ruby for fun and profit

Ruby has a quasi-convention of taking a method, slightly modifying its behavior, and denoting the variant with a trailing exclamation mark, e.g. #select becomes #select!. Sometimes these methods raise an error, modify the receiver in place, return nil on no-op, do some combination of these, or do something else entirely. Such methods are collectively referred to as “bang” methods, and there’s no hard-and-fast rule for what makes them special, just that they are “more dangerous” than their non-bang counterparts.

Today I’d like to write about bang methods that modify their receivers in place and return nil on no-op, and how you can improve performance in Ruby applications by (ab)using them to reduce object churn in hot paths.
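To make that behavior concrete, here's a minimal sketch using String#strip and String#strip!. The variable names are just for illustration, and the .dup is only there in case string literals are frozen in your environment:

```ruby
# A mutable string; .dup guards against frozen string literals.
s = "  hello  ".dup

copy   = s.strip     # non-bang: returns a new String, receiver untouched
first  = s.strip!    # bang: strips in place and returns the receiver
second = s.strip!    # bang again: nothing left to strip, so it returns nil

p copy               # => "hello"
p first.equal?(s)    # => true (same object, not a copy)
p second             # => nil
```

The nil on the second call is the part that tends to bite.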

Object#tap

Let’s start with String#strip, which removes leading and trailing whitespace from a copy of the receiver (in other words, it doesn’t mutate the original string). Let’s say you have a method that does some work on a string and returns it stripped:

def parse(input)
    output = input.dup
    # some work
    output.strip
end

You recently profiled your application and determined that parse is slowing things down, partly because the String#strip at the end creates an extra object (a superfluous copy of output) on every invocation. You’ve already diligently copied your input string at the top of the method body, and now you’ve determined that it’s safe and prudent to simply modify output in-place using the bang version of the same method. Here’s your updated code:

def parse(input)
    output = input.dup
    # some work
    output.strip!
end

Ah, but now you’ve introduced a bug: String#strip! will return nil if there’s no leading or trailing whitespace to remove from the string. The obvious solution is to strip the string and return the string in two separate steps:

def parse(input)
    output = input.dup
    # some work
    output.strip!
    output
end
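If you want to see the bug and the fix side by side, here's a small sketch. The names parse_buggy and parse_fixed are hypothetical stand-ins for the two versions of parse above, with the "some work" step left out:

```ruby
# Hypothetical stand-in for the version that ends with a bare strip!.
def parse_buggy(input)
    output = input.dup
    output.strip!    # returns nil when there is nothing to strip
end

# Hypothetical stand-in for the two-step version.
def parse_fixed(input)
    output = input.dup
    output.strip!
    output
end

p parse_buggy(" padded ")   # => "padded"
p parse_buggy("clean")      # => nil
p parse_fixed("clean")      # => "clean"
```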

Here’s where we can start to be a little clever and use Object#tap to tighten up our code. This method, common to all objects, yields the receiver to the block and returns the receiver (not the result of the block). It’s usually mentioned as a debugging aid, but we can use it here, in production code, to apply our in-place String#strip! while preserving a proper, non-nil return value:

def parse(input)
    output = input.dup
    # some work
    output.tap { |s| s.strip! }
end

Or, more succinctly:

def parse(input)
    output = input.dup
    # some work
    output.tap(&:strip!)
end
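Here's a quick sketch of why this works, assuming nothing beyond core Ruby: tap throws away whatever the block returns (including nil) and hands back the very same object. The variable names are illustrative:

```ruby
clean  = "clean".dup
padded = "  padded  ".dup

# strip! on its own returns nil for the already-clean string...
p clean.strip!                      # => nil

# ...but tap ignores the block's result and returns the receiver.
from_clean  = clean.tap(&:strip!)
from_padded = padded.tap(&:strip!)

p from_clean                        # => "clean"
p from_padded                       # => "padded"
p from_padded.equal?(padded)        # => true (no copy was made)
```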

It’s a simple thing, and it works great for operations on arrays and other types of objects as well. It’s particularly powerful when you have a long method chain and you’d like to swap out one method for its bang counterpart. For example, to tweak this expression to modify arr in-place (and not create any extra objects):

arr.sort.uniq.each { |e| .. }

You can write:

arr.sort!.tap(&:uniq!).each { |e| .. }

N.B. Array#sort! will always return the receiver, so we don’t have to worry about an unexpected nil return value.
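As a rough illustration of the chain above (the seen array is only there so the block has something observable to do):

```ruby
arr  = [3, 1, 2, 3, 1]
seen = []

result = arr.sort!.tap(&:uniq!).each { |e| seen << e }

p arr                  # => [1, 2, 3]
p result.equal?(arr)   # => true
p seen                 # => [1, 2, 3]

# With no duplicates, uniq! returns nil, but tap still passes the array along.
p([2, 1].sort!.tap(&:uniq!))   # => [1, 2]
```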

Array#each

Now consider an expression that performs a similar operation on a collection of values, such as an array of strings:

strings.map(&:strip)

In this example, Array#map will create a copy of the array, and String#strip will create a copy of each string in the array. That’s n + 1 extra objects (assuming we aren’t prohibited from modifying the originals). If we’ve determined that it’s safe and prudent to modify the array and its contents in-place, we can rewrite it like this, using the Object#tap technique described above:

strings.map! { |s| s.tap(&:strip!) }

N.B. Array#map! with a block will always return the modified receiver, so we don’t have to worry about an unexpected nil return value.

Unfortunately, as efficient as this might be, it’s somewhat contorted and difficult to grok at a glance. If there are other methods in the chain, this approach might result in a significant downgrade in code readability.

There’s a much more elegant way to write our desired expression. Array#each with a block (or a &:sym) will always return the receiver (not the result of the block)! We can handily craft our modify-in-place operation like so:

strings.each { |s| s.strip! }

Or, more succinctly:

strings.each(&:strip!)
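If you'd like to check the object churn yourself, here's one way to sketch it. The allocations helper is hypothetical (it isn't part of Ruby), and it leans on GC.stat(:total_allocated_objects), which is specific to CRuby, so treat the numbers as a rough guide rather than a benchmark:

```ruby
# Hypothetical helper: counts objects allocated while the block runs.
# Assumes CRuby, where GC.stat exposes :total_allocated_objects.
def allocations
    before = GC.stat(:total_allocated_objects)
    yield
    GC.stat(:total_allocated_objects) - before
end

# Two identical inputs, so each approach starts from padded strings.
copying_input  = Array.new(1_000) { "  padded  ".dup }
mutating_input = Array.new(1_000) { "  padded  ".dup }

copied   = allocations { copying_input.map(&:strip) }
in_place = allocations { mutating_input.each(&:strip!) }

puts "map(&:strip):   #{copied} objects allocated"
puts "each(&:strip!): #{in_place} objects allocated"
```

I'd expect the first number to land a little above the array size and the second to be at or near zero, but exact counts can vary between Ruby versions.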

Consider an array of arrays of integers. We have logic to find the average of the unique integers in each array:

arr.map(&:uniq).map { |a| a.reduce(:+).fdiv(a.length) }

…but we’ve since decided to perform these operations in-place. Easy!

arr.each(&:uniq!).map! { |a| a.reduce(:+).fdiv(a.length) }
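Here's a small worked run of both versions with made-up data. Note the last inner array has no duplicates, so uniq! returns nil for it, and each simply ignores that:

```ruby
arr = [[1, 1, 2], [3, 3], [7, 8]]

# Immutable version: new inner arrays and a new outer array.
averages = arr.map(&:uniq).map { |a| a.reduce(:+).fdiv(a.length) }
p averages             # => [1.5, 3.0, 7.5]
p arr.first            # => [1, 1, 2] (originals untouched)

# In-place version: inner arrays are deduplicated in place, then map!
# overwrites the outer array's elements with the averages.
result = arr.each(&:uniq!).map! { |a| a.reduce(:+).fdiv(a.length) }
p result.equal?(arr)   # => true
p arr                  # => [1.5, 3.0, 7.5]
```

One caveat that applies to both versions: an empty inner array makes reduce(:+) return nil, so the fdiv call would raise.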

Hash#each_value, etc.

This technique also works with other methods that return the receiver. Unfortunately, I don’t have an exhaustive list at this time.

For example, if you have a hash of arrays, and you want to remove nil values from each array, returning an updated hash, you might write:

hsh.map { |k, v| [k, v.compact] }.to_h

If the hash and its arrays are mutable, this could be rewritten as:

hsh.each_value(&:compact!)
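A quick sketch with sample data (the hash contents are invented for illustration). The :c array has no nil values, so compact! returns nil for it, which each_value ignores:

```ruby
hsh = { a: [1, nil, 2], b: [nil, nil], c: [3] }
expected = { a: [1, 2], b: [], c: [3] }

# Immutable version: a new hash holding new arrays.
copy = hsh.map { |k, v| [k, v.compact] }.to_h
p copy == expected       # => true
p copy.equal?(hsh)       # => false

# In-place version: each_value returns the hash it was called on.
result = hsh.each_value(&:compact!)
p result.equal?(hsh)     # => true
p hsh == expected        # => true
```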

Final Notes

It’s important to keep in mind that you shouldn’t mix and match paradigms. Object#tap and Array#each with non-bang methods are going to create copies, do the work, throw it all away, and return the original, unmodified receiver(s).
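To illustrate the pitfall, here's a minimal sketch of what happens when a non-bang method sneaks into tap or each:

```ruby
s = "  padded  ".dup
result = s.tap(&:strip)        # strip's copy is built, then discarded
p result                       # => "  padded  "

strings = ["  a  ".dup, "  b  ".dup]
p strings.each(&:strip)        # => ["  a  ", "  b  "]
```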

Here’s a cheatsheet for objects:

| Method | Receiver… | Return value | # objects created |
| --- | --- | --- | --- |
| #meth | Copied and modified | Modified copy | 1 |
| #meth! | Modified in-place | Receiver, or nil if unchanged | 0 |
| #tap(&:meth) | Copied and modified | Original, unmodified receiver (oops!) | 1 |
| #tap(&:meth!) | Modified in-place | Receiver | 0 |

Tip: Most of the time, you’ll want either #meth (immutable) or #tap(&:meth!) (mutable).

And a cheatsheet for array contents:

| Method | Contents… | Return value (array) | Return values (contents) | # objects created |
| --- | --- | --- | --- | --- |
| #map(&:meth) | Copied and modified | Modified copy | Modified copies | n + 1 |
| #map(&:meth!) | Modified in-place | Modified copy | Receivers, or nil values if unchanged | 1 |
| #map!(&:meth) | Copied and modified | Receiver | Modified copies | n |
| #map!(&:meth!) | Modified in-place | Receiver | Receivers, or nil values if unchanged | 0 |
| #each(&:meth) | Copied and modified | Original, unmodified receiver (oops!) | Original, unmodified receivers (double oops!) | n |
| #each(&:meth!) | Modified in-place | Receiver | Receivers | 0 |

Tip: Most of the time, you’ll want either #map(&:meth) (immutable) or #each(&:meth!) (mutable).

Lonely Operator

Some folks will be quick to read this and say, “Ah-hah! Ruby 2.3+’s lonely operator—a.k.a. the safe navigation operator—solves this problem, too!” However, it does not. Consider the case when you want an expression to return the count of unique items in an array:

arr.uniq.size

Now you want to modify the expression to modify the array in-place:

arr.uniq!.size

But Array#uniq! can return nil, and calling #size on nil will raise a NoMethodError, so you employ the lonely operator:

arr.uniq!&.size

This does prevent a NoMethodError, but now the expression can return nil, which is probably not what you want. You really want to do this:

arr.uniq!
arr.size

Or this:

arr.tap(&:uniq!).size
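Here's a short sketch comparing the two approaches on arrays with and without duplicates (the variable names are just for illustration):

```ruby
with_dupes    = [1, 1, 2]
without_dupes = [1, 2, 3]

# The lonely operator only helps when uniq! actually changed something.
p with_dupes.uniq!&.size            # => 2
p without_dupes.uniq!&.size         # => nil

# tap returns the array either way, so size always gets a real receiver.
p without_dupes.tap(&:uniq!).size   # => 3
```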

Conclusion

I hope you find these techniques useful!
