Exceptions vs. error returns, quantified

Today I got into a discussion about embedded programmers and their tendency to avoid exceptions in C++ (even after moving to new-school “embedded” systems such as smartphones, game consoles, and set-top boxes). The argument boils down to three things: code size, memory usage, and execution speed. The discussion mostly centered on whether engineers are making assumptions based on how things were 15+ years ago, so rather than continue without putting my own beliefs to the test, I constructed a minimal experiment which I think shows where the tradeoffs lie.

Note: The timing and memory analysis has been updated as of May 24, 2018.

Okay, so, a simple test case is our friend, recursive Fibonacci. It’s a simple algorithm that makes the major issues easy to demonstrate. Here are two versions; the first uses trivial checked error returns: (fib-1.cpp)

/*! fib-1.cpp
 *  Fibonacci using return codes for error conditions
 */

#include <stdio.h>
#include <stdlib.h>
#include <algorithm>

static unsigned int* stack = NULL;

bool fibonacci(unsigned int n, unsigned int& ret)
{
    if (n <= 1) {
        ret = 1;
        return true;
    }

    unsigned int a, b;
    stack = std::min(stack, &a);  // stack grows downward; remember the lowest address seen

    bool valid = fibonacci(n - 1, a);
    if (!valid)
    { return false; }

    valid = fibonacci(n - 2, b);
    if (!valid)
    { return false; }

    ret = a + b;
    if (ret < a || ret < b)
    { return false; }

    return true;
}

int main(int argc, char* argv[])
{
    if (argc != 2) {
        fprintf(stderr, "Usage: %s n\n", argv[0]);
        return 1;
    }

    unsigned int val;
    unsigned int* sstack;
    stack = &val;
    sstack = stack;
    printf("Starting stack: %p\n", stack);

    unsigned int n = atoi(argv[1]);
    bool rv = fibonacci(n, val);

    printf("Ending stack: %p (delta=%ld)\n", stack, stack - sstack);


    if (rv) {
        printf("f(%d) = %u\n", n, val);
        return 0;
    }

    printf("f(%d): overflow\n", n);
    return 1;
}

and here is using exceptions: (fib-2.cpp)

/*! fib-2.cpp
 *  Fibonacci using exceptions for error conditions
 */

#include <stdio.h>
#include <stdlib.h>
#include <exception>
#include <algorithm>

static unsigned int* stack = NULL;

unsigned int fibonacci(unsigned int n)
{
    if (n <= 1) {
        return 1;
    }

    unsigned int a, b;
    stack = std::min(stack, &a);  // stack grows downward; remember the lowest address seen

    a = fibonacci(n - 1);
    b = fibonacci(n - 2);

    unsigned int ret = a + b;
    if (ret < a || ret < b) {
        throw std::overflow_error("overflow");
    }

    return ret;
}

int main(int argc, char* argv[])
{
    if (argc != 2) {
        fprintf(stderr, "Usage: %s n\n", argv[0]);
        return 1;
    }

    unsigned int n = atoi(argv[1]);
    stack = &n;
    unsigned int* sstack = stack;

    printf("Starting stack: %p\n", stack);

    try {
        unsigned int val = fibonacci(n);
        printf("f(%d) = %u\n", n, val);
    } catch (std::exception& e) {
        fprintf(stderr, "f(%d): %s\n", n, e.what());
        return 1;
    }

    printf("Ending stack: %p (delta=%ld)\n", stack, stack - sstack);

    return 0;
}

(The error condition in this case is whether the operation overflowed.) Both versions use the same mechanism for measuring how much the stack has grown, and the error-return version is structured the way real error-checking code typically is.

(Note: I realize that this stack measurement isn’t exact in terms of total usage, but it does precisely show how much the stack grows with the depth of the call tree. That’s all I care about here.)

(Also, to stave off criticism of the error-checking version: yes, this could have used 0 to indicate an error, but that’s meaningless in most applications, since 0 could be a valid result. In real applications, if you want to stay consistent, you end up using the errorType function(input, input, input, output&) convention everywhere.)

Code size

In terms of human-readable code size, the exception version is somewhat shorter: 55 lines instead of 62. That isn’t much for a single function, but across an entire codebase it adds up quickly.

I compiled both versions with clang-900.0.39.2 (on macOS 10.13) with g++ -O3 and stripped the executables. The error-return version weighs in at 8620 bytes, while the exception version is 9488 bytes, a difference of 868 bytes.

In order to determine how much of that is due to generated code and how much is due to the inclusion of <exception>, I modified fib-2.cpp to throw a plain int instead of std::overflow_error, and then produced assembly dumps of all three versions. Stripping those assembly dumps of all labels (leaving just the generated opcodes), I get the following sizes:

  • fib-1.s: 103 opcodes
  • fib-2.s: 117 opcodes
  • fib-2-throwint.s: 99 opcodes

which suggests that exceptions most likely generate less code per throw/catch site than the checked-error equivalent, and that the initial increase in size comes from per-exception-class overhead, which is shared across every use of that exception class.

Stack usage

So, the next thing to compare is the amount of stack used. I ran each version from n=1 to 30 using seq 1 30 | xargs -n 1 ./fib-1. The error-return version grew the stack by exactly 12 ints (48 bytes) for each additional n. The exception version grew the stack by… exactly the same 48 bytes per additional n. The exception version also ended up using slightly less stack overall, albeit by just a single int.

Execution speed

This was a simple test running on my late-2017 iMac (3.8 GHz Core i5):

# time ./fib-1 45
Starting stack: 0x7ffeed8b07ec
Ending stack: 0x7ffeed8affa8 (delta=-529)
f(45) = 1836311903

real    0m7.590s
user    0m7.221s
sys 0m0.107s
# time ./fib-2 45
Starting stack: 0x7ffee5e407fc
f(45) = 1836311903
Ending stack: 0x7ffee5e3ffbc (delta=-528)

real    0m6.264s
user    0m5.947s
sys 0m0.107s

So despite the “common knowledge” that exceptions are “slow,” the exception-throwing version actually ran about 20% faster.

Conclusions

For code size and memory usage, it probably depends a lot on the complexity of the application, and a trivial application won’t do a very good job at capturing the difference between the two. It requires further study with a less-trivial example.

For execution time, in the (hopefully common) case where errors don’t occur, exceptions are faster. This makes sense: checked error returns require testing every result, good or bad, while exceptions only do extra work when an error actually occurs. Of course, in a sufficiently complicated example (lots of different exception types with a complex inheritance tree, say), handling the error condition could take far longer than checking an error return. In an environment where you expect many error conditions (transitory failures, network hiccups, etc.), error returns could well turn out to be faster. But it seems better to use error returns in the circumstances that call for them, and exceptions everywhere else.

There is, of course, one major missing component of the memory-use picture: the static data structures which have to be set up when the first try block starts. I don’t know of a quick, platform-independent way to measure that, nor what the impact is on a per-exception-type basis. However, those are essentially one-time costs, shared across every use of the exception type.

So, basically, my conclusion is that the only way to be sure which approach works better for a given application is to write the application both ways and compare. My suspicion, though, is that the small memory overhead of exceptions is far outweighed by the time savings, the reduced code complexity, and the guarantee that errors are not silently ignored: ignoring an exception requires explicitly writing code to do so, and even a blanket try{}catch(...){} at the top level does far more to “handle” an error than letting a return code drop on the floor.

Also, a larger, more meaningful comparison would do a better job of determining whether generated code size (which is an issue for embedded systems! executables load entirely into memory on MMU-less systems) grows faster with exceptions or with error returns. I suspect error returns will grow faster, since more code has to go into more places. The exception is when you genuinely need fine-grained error checking: if you find yourself wrapping a single line of code in a try/catch block, that is a good place for an error return. But if you’re dealing with larger functional units where any of several things could go wrong and abort an entire process, exceptions are probably the way to go.

Anyway, the most important thing is to not simply assume things based on “common knowledge,” but to actually test them. People who have been working in a field for a long time tend to accumulate Ways of Doing Things and collections of assumptions based on how things were when they started. I used to assume exceptions were bad because that’s what older, more experienced programmers than myself told me. But change can be good, and just because someone isn’t as experienced in a given field doesn’t mean they don’t understand the issues. A fresh pair of eyes is always a good thing.
