Ticket #4 (new defect)

Opened 1 year ago

Last modified 1 year ago

improve CopiedFailure processing

Reported by: warner Assigned to:
Priority: major Milestone: undecided
Component: unknown Version: 0.1.4
Keywords: Cc: strank

Description

I've been inspecting CopiedFailure? more closely today, while working on fixes to #2.

For reference, when an exception occurs as a result of a message being delivered, the remote_foo() method results in a Failure, which is a Twisted object that wraps an exception. The Failure is then serialized to deliver over the wire, and is reconstituted into a CopiedFailure? on the calling end.

The general problem is that a CopiedFailure? is not quite the same as a real Failure. It doesn't have any stack frames, for one, but we accomodate this by overriding getTraceback().

The main problem is that the .type attribute is a string rather than a reference to a real exception type like exceptions.IndexError? . The only places where this gets used is in Failure.check and Failure.trap, which are utility methods to allow error-processing chains make decisions about the kind of failure that occurred. Both compare self.type against some types passed in by the user.

The comparison itself works fine, since Failure.check is thoughtful enough to stringify the candidates before comparing them to self.type . The actual problem is in Failure.trap(), because it is specified to re-raise the original exception if it doesn't match the candidates. This is usually called inside a deferred callback, which means that the enclosing Deferred is going to wrap a brand new Failure around the exception just raised.

The Failure class is clever enough to notice that it is wrapping another Failure, and then copy the dict out of the old one into itself. The problem is that it thus copies .type (which is a string instead of, say, IndexError?), but the class changes from CopiedFailure? to a normal Failure . This means that any methods we've overridden in CopiedFailure? (specifically to accomodate the fact that CopiedFailures? aren't the same as regular Failures) go away.

The specific problem I witnessed was when, during the course of a unit test, a callRemote() triggered a remote exception, and then tried to use f.trap on the result, and the f.trap failed, and no other errback in the active Deferred chain caught the failure. The "unhandled error in deferred" threw the resulting Failure-with-string-.type into the error log, and eventually trial's error-scanner noticed it and tried to print it, using getTraceback(), which failed.

Attachments

foolscap-copiedremoteexception.patch (1.2 kB) - added by strank on 08/08/07 06:07:46.
foolscap-copiedremoteexception.hgexport.patch (1.2 kB) - added by strank on 04/16/08 08:20:17.
updated patch for current trunk

Change History

07/25/07 18:36:33 changed by warner

I'm trying out a fix for this: when constructing the CopiedFailure?, set .type as follows:

        assert isinstance(self.type, str)
        typepieces = self.type.split(".")
        class ExceptionLikeString:
            pass
        self.type = ExceptionLikeString
        self.type.__module__ = ".".join(typepieces[:-1])
        self.type.__name__ = typepieces[-1]

The resulting .type looks like a class, and reflect.qual() on it returns the same string as the original .type, so getTraceback() and friends should survive it. check() only pays attention to .parents(), which were strings in the first place, so that's unaffected. If .trap fires, the new Failure will have a .type that is identical.

The only drawback I can think of so far is that the new Failure does not have our overridden printTraceback() method, which means that when the traceback is printed, it will say "Traceback (failure with no frames)" even if a string traceback was provided by the original caller. To fix this, I think we'd need to spoof some .frames objects. Which might not be a bad idea, really, since it would reduce the amount of overridden code we've got elsewhere (and might let us get rid of our custom printTraceback()).

07/27/07 16:45:56 changed by warner

  • summary changed from fix CopiedFailure processing to improve CopiedFailure processing.

08/08/07 06:06:53 changed by strank

  • cc set to strank.

The current code for CopiedFailure? does not resolve the issue I reported at the twisted trac Ticket 2732 (http://twistedmatrix.com/trac/ticket/2732).

The problem remains that someone might reasonably do raise copiedfailure.type (or equivalent: generator.throw(copiedfailure.type) as is done in inlinecallbacks) and the resulting Exception would not be caught by except Exception: ... .

I propose a small change to the current code: Use an Exception class as the base class for the dynamically constructed failure type (see patch, the tests pass). This allows them to be caught more easily.

(I also saw the comment in the code tagged "MAYBE:" about reconstructing native exception types, which is what I did in the patch posted on the twisted trac. I am now not sure that this is better. Maybe use the known exception type as a second base class for the dynamically constructed Exception type?)

08/08/07 06:07:46 changed by strank

  • attachment foolscap-copiedremoteexception.patch added.

04/16/08 08:20:17 changed by strank

  • attachment foolscap-copiedremoteexception.hgexport.patch added.

updated patch for current trunk