Foolscap Failure Reporting

Signalling Remote Exceptions

The remote_-prefixed methods which Foolscap invokes, just like their local counterparts, can either return a value or raise an exception. Foolscap callers can use the normal Twisted conventions for handling asynchronous failures: callRemote returns a Deferred object, which will eventually either fire its callback function (if the remote method returned a normal value), or its errback function (if the remote method raised an exception).

There are several other reasons that the Deferred returned by callRemote might fire its errback:

local outbound schema violation: the outbound method arguments did not match the RemoteInterface that is in force. This is an optional form of typechecking for remote calls, and is activated when the remote object describes itself as conforming to a named RemoteInterface which is also declared in a local class. The local constraints are checked before the message is transmitted over the wire. A constraint violation is indicated by raising foolscap.schema.Violation, which is delivered through the Deferred's errback.
network partition: if the underlying TCP connection is lost before the response has been received, the Deferred will errback with a foolscap.ipb.DeadReferenceError exception. Several things can cause this: the remote process shutting down (intentionally or otherwise), a network partition or timeout, or the local process shutting down (Tub.stopService will terminate all outstanding remote messages before shutdown).
remote inbound schema violation: as the serialized method arguments were unpacked by the remote process, one of them violated that process's inbound RemoteInterface. This check serves to protect each process from incorrect types which might either confuse the subsequent code or consume a lot of memory. These constraints are enforced as the tokens are read off the wire, and are signalled with the same Violation exception as above.
remote method exception: if the remote_ method raises an exception, or returns a Deferred which subsequently fires its errback, the caller will see an errback which attempts to replicate the remote exception. This errback will receive a CopiedFailure instance, described below.
remote outbound schema violation: as the remote method's return value is serialized and put on the wire, the values are compared against the return-value constraint (if a RemoteInterface is in effect). If it does not match the constraint, a Violation will be raised.
local inbound schema violation: when the serialized return value arrives on the original caller's side of the wire, the return-value constraint of any effective RemoteInterface will be applied. This protects the caller's response code from unexpected values. Any mismatches will be signalled with a Violation exception.
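The six causes listed above can be summarized as a small lookup table. This is only a reading aid, not part of Foolscap: ErrbackCause, ERRBACK_CAUSES, and causes_detected_on are hypothetical names, and the "detected_on" column records where the problem is noticed as described in the list (the network partition entry is marked local on the assumption that no response can arrive over a lost connection).

```python
from collections import namedtuple

# Hypothetical summary record; none of these names come from Foolscap.
ErrbackCause = namedtuple("ErrbackCause", ["name", "detected_on", "exception"])

ERRBACK_CAUSES = [
    ErrbackCause("local outbound schema violation", "local",
                 "foolscap.schema.Violation"),
    # Assumption: detected locally, because the connection itself is gone.
    ErrbackCause("network partition", "local",
                 "foolscap.ipb.DeadReferenceError"),
    ErrbackCause("remote inbound schema violation", "remote",
                 "foolscap.schema.Violation"),
    ErrbackCause("remote method exception", "remote",
                 "whatever the remote_ method raised"),
    ErrbackCause("remote outbound schema violation", "remote",
                 "foolscap.schema.Violation"),
    ErrbackCause("local inbound schema violation", "local",
                 "foolscap.schema.Violation"),
]


def causes_detected_on(side):
    """Return the names of the causes noticed on the given side."""
    return [c.name for c in ERRBACK_CAUSES if c.detected_on == side]


remote_causes = causes_detected_on("remote")
local_causes = causes_detected_on("local")
```

Grouping the causes this way makes it easier to see which failures originate in the peer and which are produced by the caller's own process.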

CopiedFailures

Twisted uses the twisted.python.failure.Failure class to encapsulate Python exceptions in an instance which can be passed around, tested, and examined in an asynchronous fashion. It does this by copying much of the information out of the original exception context (including a stack trace and the exception instance itself) into the Failure instance. When an exception is raised during a Deferred callback function, it is converted into a Failure instance and passed to the next errback handler in the chain.

RemoteReference.callRemote uses the same convention: any exceptions that occur during the remote method call are delivered to the errback handler. However, these exceptions occur in the remote process, and Failure objects contain references to local state which cannot be precisely replicated on a different system (stack frames and exception classes). So, when an exception happens on the remote side of a callRemote invocation, the errback handler will receive a CopiedFailure instance instead.

CopiedFailure is designed to behave very much like a regular Failure object. The check and trap methods work on CopiedFailures just like they do on Failures.

However, all of the Failure's attributes must be converted into strings for serialization. As a result, the original .value attribute (which contains the exception instance, which might contain additional information about the problem) is replaced by a stringified representation. The frames of the original stack trace are also replaced with a string, so they can be printed but not examined. The exception class is also passed as a string (using Twisted's reflect.qual fully-qualified-name utility), but check and trap both compare by string name instead of object equality, so most applications won't notice the difference.
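The name-based comparison described above can be illustrated with a small stand-in class. This is a sketch only and not Foolscap's CopiedFailure: StringifiedFailure, qualified_name, and RecordNotFound are hypothetical names, and the real class may store or compare its strings differently.

```python
def qualified_name(cls):
    """Return a dotted module.Class name, similar in spirit to reflect.qual."""
    return "%s.%s" % (cls.__module__, cls.__name__)


class StringifiedFailure(object):
    """Toy stand-in for a failure that carries only strings."""

    def __init__(self, exc):
        # Keep strings only: the class name, the names of the class and
        # all of its base classes, and the stringified exception value.
        self.type = qualified_name(type(exc))
        self.parents = [qualified_name(c) for c in type(exc).__mro__]
        self.value = str(exc)

    def check(self, *error_types):
        """Return the first argument whose name matches, else None."""
        for error in error_types:
            # Accept either a class or an already-qualified name string.
            name = error if isinstance(error, str) else qualified_name(error)
            if name in self.parents:
                return error
        return None


class RecordNotFound(KeyError):
    """Hypothetical application exception used for the demonstration."""


f = StringifiedFailure(RecordNotFound("bob"))
matched = f.check(ValueError, KeyError)  # matches via the KeyError base class
unmatched = f.check(ValueError)          # no name in common, so None
```

Because only names are compared, a handler written as f.check(KeyError) behaves the same whether the exception class object survived the trip or not, which is why most applications do not notice the substitution.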

The default behavior of CopiedFailure is to include a string copy of the stack trace, generated with printTraceback(), which will include lines of source code when available. To reduce the amount of information sent over the wire, stack trace strings larger than about 2000 bytes are truncated in a fashion that tries to preserve the top and bottom of the stack.
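One way to keep both ends of a long traceback string is to cut out the middle. The function below is a minimal sketch of that idea, not Foolscap's actual code: truncate_traceback, the head and tail sizes, and the marker text are all assumptions chosen for illustration, and the limit is counted in characters of a Python string rather than bytes.

```python
def truncate_traceback(text, limit=2000, head=1200, tail=600,
                       marker="\n-- TRACEBACK ELIDED --\n"):
    """Shorten text to under `limit` characters, keeping both ends.

    All default sizes and the marker string are illustrative assumptions.
    """
    if len(text) <= limit:
        return text
    # Keep the first `head` and last `tail` characters, dropping the middle.
    return text[:head] + marker + text[-tail:]


long_trace = "top frame\n" + ("middle frame\n" * 500) + "bottom frame"
short_trace = truncate_traceback(long_trace)
```

Keeping the head and the tail preserves the outermost call and the point of failure, which are usually the most useful parts when diagnosing a problem from the receiving side.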

unsafeTracebacks

Applications which consider their lines of source code or their exceptions' list of (filename, line number) tuples to be sensitive information can set the "unsafeTracebacks" flag in their Tub to False; the server will then remove stack information from the CopiedFailure objects it sends to other systems.

from foolscap.api import Tub

t = Tub()
t.unsafeTracebacks = False

When unsafeTracebacks is False, the CopiedFailure will only contain the stringified exception type, value, and parent class names.

Distinguishing Remote Exceptions

The original caller can tell the difference between exceptions that occurred locally and ones that occurred on the remote end. The most common use for this is to re-raise exceptions that resulted from programming errors in the local code, while cleanly handling or ignoring errors that were caused by the code at the remote end. The general idea is that remote code may be maliciously trying to confuse or subvert your program's control flow by returning unexpected exceptions, but that exceptions which occur locally (and are not otherwise caught and handled) are probably bugs which need to be made visible. The philosophy of how to best handle errors is beyond the scope of this document, but Foolscap tries to provide the tools to allow programmers to implement whatever approach they choose.

It is useful to distinguish a remote exception from a local one, especially when the code involves multiple processing steps (some local, some remote). For example, the following snippet performs a local processing step, then asks a remote server for information, then adds that information into a local database. All three steps are asynchronous.

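def get_and_store_record(name):
    # local_db and rref are assumed to exist already: a local database
    # wrapper and a RemoteReference to the server.
    d = local_db.getIDNumber(name)
    d.addCallback(lambda idnum: rref.callRemote("get_record", idnum))
    # Store the record that the remote call returned.
    d.addCallback(lambda record: local_db.storeRecord(name, record))
    return d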

The caller of get_and_store_record might like to distinguish a problem that occurred in getIDNumber from one that occurred during the remote call to remote_get_record.

For each Foolscap event that can raise a remote exception described above (i.e. remote inbound schema Violation, remote method exception, remote outbound schema Violation), the original caller will receive a CopiedFailure instance. For Foolscap events that raise exceptions locally (local outbound schema Violation, local inbound schema Violation), the caller will receive a regular Failure instance. Any non-Foolscap exception events (such as the getIDNumber and storeRecord calls in the example above) will also result in a regular Failure.

Application code should use foolscap.ipb.failure_is_remote() to distinguish between local and remote failures. This returns True for CopiedFailure instances and False for regular Failures. A future version of Foolscap may change the way CopiedFailure is used (ideally Failure and CopiedFailure should be the same class), but failure_is_remote will continue to work correctly.

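from foolscap.ipb import failure_is_remote

d = get_and_store_record("bob")
def handle_remote_exception(f):
    # Local failures are probably bugs: pass them down the errback chain.
    if not failure_is_remote(f):
        return f
    # Remote failures are reported and then swallowed.
    print("Remote caller failed: %s" % (f,))
    print("no record stored")
    return None
d.addErrback(handle_remote_exception)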