An improved error-handling package for Go

Today we’re open-sourcing github.com/yext/yerrors, a tiny Go error-handling library that is a drop-in replacement for golang.org/x/xerrors. It augments xerrors with Wrap and Mask functions that add to the stack trace without adding to the error message.

Read on for the full story.

Error handling in Go, a brief history

We started using Go in 2014 to build the Yext Pages system (see The Making of Yext Pages), and some of the decisions made at that time are showing their age.

One of those decisions was to go all-in on the github.com/juju/errgo library for error handling. It was written by celebrity Gopher @rogpeppe and released in 2014 along with some suggested rules for generating good errors in Go.

Go, in contrast to Java, treats errors as values, with no special language support. As a result, if you’d like stack traces associated with your errors or the ability to determine if a given error is of a certain type, or caused by an error of that type (e.g. a deadline being exceeded), then it has to be implemented in a library and used consistently across your codebase.

However, errgo never reached critical mass, and the community eventually consolidated around github.com/pkg/errors, which sported a simpler API and was written by another celebrity Gopher @davecheney.

Then in 2018, the Go team worked on a major upgrade to error handling, outlined by the Error Values Draft Design, and they implemented the golang.org/x/xerrors package, with the intention to include it in the standard library. It has a simple-but-powerful API and promises to support interoperability across the whole ecosystem.

A high-quality official solution at last! If you aren’t familiar with the API, Working with Errors in Go 1.13 is the best guide.

However, there is one big wrinkle: Go 1.13 included only half of the xerrors package. It includes matching and unwrapping (essential for interoperability) but lacks support for stack traces.

There are currently no plans to incorporate the whole xerrors package into the standard library.
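
For readers who have not used the matching and unwrapping half that did land in Go 1.13, here is a minimal sketch using only the standard library. The queryError type and the runQuery, loadPage, timedOut, and failedQuery helpers are hypothetical names invented for this illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// queryError is a hypothetical error type that carries the failed query.
type queryError struct {
	query string
	err   error
}

func (e *queryError) Error() string { return "query " + e.query + ": " + e.err.Error() }

// Unwrap exposes the underlying error to errors.Is and errors.As.
func (e *queryError) Unwrap() error { return e.err }

// runQuery simulates a query that fails because its deadline passed.
func runQuery(q string) error {
	return &queryError{query: q, err: context.DeadlineExceeded}
}

// loadPage adds context with %w, which keeps the original error in the chain.
func loadPage() error {
	return fmt.Errorf("loading page: %w", runQuery("SELECT 1"))
}

// timedOut reports whether err was caused by an exceeded deadline.
func timedOut(err error) bool { return errors.Is(err, context.DeadlineExceeded) }

// failedQuery returns the query text if a *queryError is anywhere in the chain.
func failedQuery(err error) (string, bool) {
	var qe *queryError
	if errors.As(err, &qe) {
		return qe.query, true
	}
	return "", false
}

func main() {
	err := loadPage()
	fmt.Println(err)
	fmt.Println(timedOut(err))
	fmt.Println(failedQuery(err))
}
```

Note that nothing here records where the error was created or wrapped. That is the stack trace half, which remained in xerrors.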

Present day

Yext has about 500k lines of Go code in a monorepo, and a sizable percentage of that involved error handling with errgo.

We were motivated to make a change for a few reasons:

  • Get onto a standard error handling library instead of our idiosyncratic one.

  • Much-improved ergonomics should lead to fewer bugs in error-handling code. For example, errgo’s distinction between a “cause” and an “underlying” error is easy to confuse.

  • A standard library is more likely to interoperate seamlessly with third-party libraries. For example, Sentry recently added support for storing stack traces associated with xerrors in PR #246, and go.uber.org/multierr added support according to a golang-nuts discussion thread.

We find stack trace information invaluable; golang.org/x/xerrors has a great API and was the closest thing to a standard, so we chose to migrate to it.

Performing the migration

We decided to perform a flag-day migration of the Go codebase from errgo to xerrors, using automated tools to accomplish the vast majority of the changes:

  • eg to perform the code rewriting.

  • Gazelle to update BUILD files.

Example-based code rewriting seemed like a great fit for this work. I had to make widespread changes like this:

errgo.New("a")               -> xerrors.New("a")
errgo.Newf("%s", a)          -> xerrors.Errorf("%s", a)
errgo.Mask(err)              -> xerrors.Mask(err)
errgo.Notef(err, "%s", a)    -> xerrors.Errorf("%s: %w", a, err)
errgo.NoteMask(err, "%s", a) -> xerrors.Errorf("%s: %v", a, err)
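
The last two rows differ only in the verb applied to the original error, %w versus %v, and that difference matters to callers. The sketch below uses the standard library's fmt.Errorf, which treats the two verbs the same way for the purpose of errors.Is; the noted, masked, and isNoRows helpers and the errNoRows value are hypothetical names for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoRows = errors.New("no rows")

// noted mirrors the "%s: %w" rewrite: the original error stays in the chain.
func noted(err error, msg string) error { return fmt.Errorf("%s: %w", msg, err) }

// masked mirrors the "%s: %v" rewrite: only the text of the original error survives.
func masked(err error, msg string) error { return fmt.Errorf("%s: %v", msg, err) }

// isNoRows reports whether errNoRows can be found by unwrapping err.
func isNoRows(err error) bool { return errors.Is(err, errNoRows) }

func main() {
	a := noted(errNoRows, "loading user")
	b := masked(errNoRows, "loading user")
	fmt.Println(a.Error() == b.Error())   // the messages are identical
	fmt.Println(isNoRows(a), isNoRows(b)) // only the %w version still matches
}
```

A rewrite that mixed up the two verbs would produce identical log output and still change which errors callers can match, so the rules need to keep them distinct.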

An eg template looks like this:

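package template

import "github.com/juju/errgo"
import "golang.org/x/xerrors"

func before(err error, str string, arg interface{}) error {
	return errgo.Notef(err, str, arg)
}

// str is itself a format string, so ": %w" is appended to it rather than passing it as an argument.
func after(err error, str string, arg interface{}) error {
	return xerrors.Errorf(str+": %w", arg, err)
}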

Reality is always a bit messier; in this case, eg does not seamlessly support variadic arguments, so I had to create some additional rules for each size of argument list that appeared in our codebase. But overall, this worked well. The main benefit is that it allowed me to iteratively test the set of migration rules against the contents of HEAD, without ever having to merge new commits into my working copy. Since it’s automated, I could update to the latest revision and re-apply the refactoring in seconds.

Some custom error types and bespoke identity testing code had to be updated manually, but the combination of Go & monorepo enables us to build & (unit) test the whole codebase quickly. Despite a change that touched tens of thousands of lines, it required just a couple days of work to develop. Even better, we could be confident in the correctness, and there were no reported issues from the migration.

Not quite a wrap

One issue did arise during migration. Our codebase makes liberal use of errgo.Mask (similar to the better-known github.com/pkg/errors.WithStack). It records the stack frame without adding context to the error message.

We planned to add a func Wrap(error) error to perform this function, but it turns out that this is not possible when using the xerrors package.

Why?? How can this be?

Philosophy of errors

The xerrors library requires developers to add context to an error to add a stack frame to the trace, like this:

return xerrors.Errorf("looking up username %q: %w", username, err)

The alternative is to not wrap the error at all, which results in a stack trace that omits this location. As a codebase grows in complexity and functionality is factored into more pieces, more and more function calls do not merit inclusion in a human-readable error message, and the number of omitted stack frames grows.

The xerrors philosophy says that is working as intended. We should only record “interesting” stack frames, which have accompanying context, and other frames are not valuable when debugging.

The resulting stack traces are more concise. Although some stack frames are missing, comprehension is enhanced by eliding unhelpful details. This approach contrasts with that of Java, well known for legendary stack traces that require an eagle eye to pick out the small percentage of interesting frames.

[Image: a giant Java stack trace]

I have used xerrors on some isolated projects, and my impression is that the approach works well.

Resolution

Regardless of philosophy, I wasn’t comfortable with dropping all of the calls to errgo.Mask in our codebase. I needed a replacement that allowed me to add a stack frame without touching the error message.

Concretely, we would like the following behavior for no-context errors:

  1. fmt.Sprintf("%v", err) or err.Error() should return the simple message, eliding context-less wrappers.

  2. fmt.Sprintf("%+v", err) should show all stack frames, plus the message if present.

This turns out to be impossible without tweaking the errors returned by xerrors.Errorf.
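
To make the two requirements concrete, here is a minimal standalone sketch built only on the standard library. It is not the yerrors implementation, and it sidesteps the hard part described above: it makes no attempt to cooperate with the frames and detail formatting of errors created by xerrors.Errorf. The framed type and the wrap, frameCount, lookup, handler, short, and detailed helpers are all hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

var errNotFound = errors.New("record not found")

// framed is a hypothetical wrapper: it remembers where it was created
// but adds nothing to the error message.
type framed struct {
	err  error
	file string
	line int
}

// wrap (hypothetical) records the caller's file and line around err.
func wrap(err error) error {
	if err == nil {
		return nil
	}
	_, file, line, _ := runtime.Caller(1)
	return &framed{err: err, file: file, line: line}
}

// Error returns the inner message unchanged, so context-less wrappers are elided.
func (f *framed) Error() string { return f.err.Error() }

// Unwrap keeps the chain visible to errors.Is and errors.As.
func (f *framed) Unwrap() error { return f.err }

// Format appends the recorded location for %+v and prints only the message otherwise.
func (f *framed) Format(s fmt.State, verb rune) {
	if verb == 'v' && s.Flag('+') {
		fmt.Fprintf(s, "%+v\n    at %s:%d", f.err, f.file, f.line)
		return
	}
	fmt.Fprint(s, f.Error())
}

// frameCount reports how many recorded locations are in err's chain.
func frameCount(err error) int {
	n := 0
	for ; err != nil; err = errors.Unwrap(err) {
		if _, ok := err.(*framed); ok {
			n++
		}
	}
	return n
}

// lookup and handler stand in for two layers with nothing useful to add to the message.
func lookup() error  { return wrap(errNotFound) }
func handler() error { return wrap(lookup()) }

// short and detailed render err with %v and %+v respectively.
func short(err error) string    { return fmt.Sprintf("%v", err) }
func detailed(err error) string { return fmt.Sprintf("%+v", err) }

func main() {
	err := handler()
	fmt.Println(short(err))    // the message alone
	fmt.Println(detailed(err)) // the message, then one "at file:line" per wrap call
	fmt.Println(frameCount(err), errors.Is(err, errNotFound))
}
```

In this toy version every error in the chain is either a plain error or a framed wrapper, so the formatting is easy to control. The difficulty in practice comes from mixing such wrappers with errors that print their own message and frame.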

Today we’re open-sourcing a small Go error-handling library that does exactly that. github.com/yext/yerrors has the same API as xerrors and adds Wrap and Mask functions.

Reflection

The combination of automated refactoring tools and a monorepo enabled a giant refactoring to proceed quickly and without incident.