Opened 17 years ago

Last modified 15 years ago

#4 new defect

fix CopiedFailure processing

Reported by: Brian Warner
Owned by:
Priority: major
Milestone: undecided
Component: error-handling
Version: 0.1.4
Keywords:
Cc: strank

Description

I've been inspecting CopiedFailure more closely today, while working on fixes to #2.

For reference, when an exception occurs as a result of a message being delivered, the remote_foo() method results in a Failure, which is a Twisted object that wraps an exception. The Failure is then serialized for delivery over the wire, and is reconstituted into a CopiedFailure on the calling end.

The general problem is that a CopiedFailure is not quite the same as a real Failure. It doesn't have any stack frames, for one, but we accommodate this by overriding getTraceback().

The main problem is that the .type attribute is a string rather than a reference to a real exception type like exceptions.IndexError. The only places where this gets used are Failure.check and Failure.trap, which are utility methods that let error-processing chains make decisions about the kind of failure that occurred. Both compare self.type against some types passed in by the user.

The comparison itself works fine, since Failure.check is thoughtful enough to stringify the candidates before comparing them to self.type. The actual problem is in Failure.trap(), because it is specified to re-raise the original exception if it doesn't match the candidates. trap() is usually called inside a deferred callback, which means that the enclosing Deferred is going to wrap a brand new Failure around the exception just raised.
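To make the check/trap behavior concrete, here is a minimal sketch in plain Python. It does not use Twisted: ToyCopiedFailure and qualified_name are hypothetical stand-ins written for this ticket, and they only imitate the behavior described above (string .type, stringified candidates, re-raise on a failed trap).

```python
def qualified_name(cls):
    """Return 'module.ClassName' for an exception class (illustrative helper)."""
    return "%s.%s" % (cls.__module__, cls.__name__)


class ToyCopiedFailure(Exception):
    """Hypothetical stand-in for a CopiedFailure: .type is a string."""

    def __init__(self, type_name, message):
        Exception.__init__(self, message)
        self.type = type_name  # a name such as "builtins.IndexError", not the class

    def check(self, *candidates):
        # Stringify class candidates so they can be compared to a string .type.
        for candidate in candidates:
            if isinstance(candidate, str):
                name = candidate
            else:
                name = qualified_name(candidate)
            if name == self.type:
                return candidate
        return None

    def trap(self, *candidates):
        # Same comparison as check(), but re-raise when nothing matches.
        matched = self.check(*candidates)
        if matched is None:
            raise self
        return matched


f = ToyCopiedFailure(qualified_name(IndexError), "list index out of range")
matched = f.check(KeyError, IndexError)  # the class matches through its name

try:
    f.trap(KeyError)  # no candidate matches, so the failure is re-raised
    reraised = None
except ToyCopiedFailure as err:
    reraised = err
```

In this sketch the comparison succeeds even though .type is a string, and the failed trap() raises the failure object itself. In the real system, that raised object is what the enclosing Deferred then wraps.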

The Failure class is clever enough to notice that it is wrapping another Failure, and then copy the dict out of the old one into itself. The problem is that it thus copies .type (which is a string instead of, say, IndexError), but the class changes from CopiedFailure to a normal Failure. This means that any methods we've overridden in CopiedFailure (specifically to accommodate the fact that CopiedFailures aren't the same as regular Failures) go away.
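The loss of the overridden methods can be illustrated with a small toy model. This is a sketch under assumptions, not Twisted's code: ToyFailure, ToyCopiedFailure and get_traceback are hypothetical names, and the only behavior borrowed from the description above is that wrapping a failure copies its attribute dict while the class of the wrapper stays the base class.

```python
class ToyFailure(object):
    """Hypothetical stand-in for Failure; not Twisted's real class."""

    def __init__(self, value):
        if isinstance(value, ToyFailure):
            # Wrapping another failure: adopt its attributes wholesale.
            # The new object's class is still ToyFailure, though.
            self.__dict__ = dict(value.__dict__)
            return
        self.value = value
        self.type = type(value)

    def get_traceback(self):
        # Assumes .type is a real exception class.
        return "Traceback: %s: %s" % (self.type.__name__, self.value)


class ToyCopiedFailure(ToyFailure):
    """Hypothetical stand-in for CopiedFailure: .type arrives as a string."""

    def __init__(self, type_name, value):
        self.type = type_name
        self.value = value

    def get_traceback(self):
        # Override that copes with a string .type.
        return "Remote traceback: %s: %s" % (self.type, self.value)


copied = ToyCopiedFailure("builtins.IndexError", "list index out of range")
rewrapped = ToyFailure(copied)  # roughly what re-wrapping a raised failure does

try:
    rewrapped.get_traceback()  # base-class method meets a string .type
    error = None
except AttributeError as err:
    error = err
```

Here rewrapped carries the string .type from the copied failure but resolves get_traceback() on the base class, so the call fails. The exact exception in the real code may differ; the toy only shows why the override no longer protects the object.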

The specific problem I witnessed came up during a unit test: a callRemote() triggered a remote exception, the test then used f.trap on the result, the f.trap did not match, and no other errback in the active Deferred chain caught the failure. The "unhandled error in deferred" handling threw the resulting Failure-with-string-.type into the error log, and eventually trial's error-scanner noticed it and tried to print it using getTraceback(), which failed.
