Ticket #105: failures.xhtml

File failures.xhtml, 9.2 KB (added by Brian Warner, 15 years ago)

proposed API for a failure_is_remote(f) predicate

Line 
1<html xmlns="http://www.w3.org/1999/xhtml">
2<head>
3<title>Foolscap Failure Reporting</title>
4<style src="stylesheet-unprocessed.css"></style>
5</head>
6
7<body>
8<h1>Foolscap Failure Reporting</h1>
9
10<h2>Signalling Remote Exceptions</h2>
11
12<p>The <code>remote_</code>-prefixed methods which Foolscap invokes, just
13like their local counterparts, can either return a value or raise an
14exception. Foolscap callers can use the normal Twisted conventions for
15handling asynchronous failures: <code>callRemote</code> returns a Deferred
16object, which will eventually either fire its callback function (if the
17remote method returned a normal value), or its errback function (if the
18remote method raised an exception).</p>
19
20<p>There are several other reasons that the Deferred returned
21by <code>callRemote</code> might fire its errback:</p>
22
23<ul>
24 <li>local outbound schema violation: the outbound method arguments did not
25     match the <code>RemoteInterface</code> that is in force. This is an
26     optional form of typechecking for remote calls, and is activated when
27     the remote object describes itself as conforming to a named
28     <code>RemoteInterface</code> which is also declared in a local class.
29     The local constraints are checked before the message is transmitted over
30     the wire. A constraint violation is indicated by
31     raising <code>foolscap.schema.Violation</code>, which is delivered
32     through the Deferred's errback.</li>
33 <li>network partition: if the underlying TCP connection is lost before the
34     response has been received, the Deferred will errback with
35     a <code>foolscap.ipb.DeadReferenceError</code> exception. Several things
36     can cause this: the remote process shutting down (intentionally or
37     otherwise), a network partition or timeout, or the local process
38     shutting down (<code>Tub.stopService</code> will terminate all
39     outstanding remote messages before shutdown).</li>
40 <li>remote inbound schema violation: as the serialized method arguments were
41     unpacked by the remote process, one of them violated that process's
42     inbound <code>RemoteInterface</code>. This check serves to protect each
43     process from incorrect types which might either confuse the subsequent
44     code or consume a lot of memory. These constraints are enforced as the
45     tokens are read off the wire, and are signalled with the
46     same <code>Violation</code> exception as above.</li>
47 <li>remote method exception: if the <code>remote_</code> method raises an
48     exception, or returns a Deferred which subsequently fires its errback,
49     the caller will see an errback which attempts to replicate the remote
50     exception. This errback will receive a <code>CopiedFailure</code>
51     instance, described below.</li>
52 <li>remote outbound schema violation: as the remote method's return value is
53     serialized and put on the wire, the values are compared against the
54     return-value constraint (if a <code>RemoteInterface</code> is in
55     effect). If it does not match the constraint, a Violation will be
56     raised.</li>
57 <li>local inbound schema violation: when the serialized return value arrives
58     on the original caller's side of the wire, the return-value constraint
59     of any effective <code>RemoteInterface</code> will be applied. This
60     protects the caller's response code from unexpected values. Any
61     mismatches will be signalled with a Violation exception.</li>
62</ul>
63
64
65<h2>CopiedFailures</h2>
66
67<p>Twisted uses the <code>twisted.python.failure.Failure</code> class to
68encapsulate Python exceptions in an instance which can be passed around,
69tested, and examined in an asynchronous fashion. It does this by copying much
70of the information out of the original exception context (including a stack
71trace and the exception instance itself) into the <code>Failure</code>
72instance. When an exception is raised during a Deferred callback function, it
73is converted into a Failure instance and passed to the next errback handler
74in the chain.</p>
75
76<p><code>RemoteReference.callRemote</code> uses the same convention: any
77exceptions that occur during the remote method call are delivered to the
78errback handler. However, several exceptions can occur on the remote process,
79and Failure objects contain references to local state which cannot be
80precisely replicated on a different system (stack frames and exception
81classes). So, when an exception happens on the remote side of
82a <code>callRemote</code> invocation, the errback handler will receive
83a <code>CopiedFailure</code> instance instead.</p>
84
85<p><code>CopiedFailure</code> is designed to behave very much like a
86regular <code>Failure</code> object. The <code>check</code>
87and <code>trap</code> methods work on <code>CopiedFailure</code>s just like
88they do on <code>Failure</code>s.</p>
89
90<p>However, all of the Failure's attributes must be converted into strings
91for serialization. As a result, the original <code>.value</code> attribute
92(which contains the exception instance, which might contain additional
93information about the problem) is replaced by a stringified representation.
94The frames of the original stack trace are also replaced with a string, so
95they can be printed but not examined. The exception class is also passed as a
96string (using Twisted's <code>reflect.qual</code> fully-qualified-name
97utility), but <code>check</code> and <code>trap</code> both compare by string
98name instead of object equality, so most applications won't notice the
99difference.</p>
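<p>The name-based comparison can be illustrated with a minimal sketch. The
<code>StringTypedFailure</code> class and the <code>qual</code> helper below
are hypothetical stand-ins written for this example (they are not Foolscap or
Twisted code). They assume only that the exception class and its parent
classes travel as fully-qualified name strings, as described above.</p>

```python
def qual(cls):
    """Return a fully-qualified class name such as 'builtins.KeyError'."""
    return "%s.%s" % (cls.__module__, cls.__name__)


class StringTypedFailure(object):
    """Hypothetical stand-in: a failure whose type survives only as strings."""

    def __init__(self, exc):
        self.type = qual(type(exc))      # exception class, as a string
        self.value = str(exc)            # exception instance, stringified
        # Names of the class and all of its base classes.
        self.parents = [qual(c) for c in type(exc).__mro__]

    def check(self, *error_types):
        """Return the first matching type (compared by name), else None."""
        for error in error_types:
            name = error if isinstance(error, str) else qual(error)
            if name in self.parents:
                return error
        return None


f = StringTypedFailure(KeyError("bob"))
matched = f.check(ValueError, LookupError)   # KeyError derives from LookupError
```

<p>Because the comparison is done on names, a caller can pass either a class
or its qualified name, and a parent class matches as well as the exact
class.</p>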
100
101<p>The default behavior of CopiedFailure is to include a string copy of the
102stack trace, generated with <code>printTraceback()</code>, which will include
103lines of source code when available. To reduce the amount of information sent
104over the wire, stack trace strings larger than about 2000 bytes are truncated
105in a fashion that tries to preserve the top and bottom of the stack.</p>
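<p>One way such a truncation could work is sketched below. This is an
illustration only: the function name, the marker text, and the even split
between top and bottom are assumptions, it counts characters rather than
bytes, and the algorithm Foolscap actually uses may differ.</p>

```python
def truncate_traceback(text, limit=2000, marker="\n-- TRACEBACK ELIDED --\n"):
    """Shorten text to at most `limit` characters, keeping both ends.

    Assumes `limit` is comfortably larger than the marker itself.
    """
    if len(text) > limit:
        keep = limit - len(marker)   # characters left for real content
        head = keep // 2             # portion kept from the top of the stack
        tail = keep - head           # portion kept from the bottom
        return text[:head] + marker + text[len(text) - tail:]
    return text


long_trace = "A" * 3000 + "Z" * 3000
short_trace = truncate_traceback(long_trace)
```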
106
107<h3>unsafeTracebacks</h3>
108
109<p>Applications which consider their lines of source code or their
110exceptions' list of (filename, line number) tuples to be sensitive
111information can set the "unsafeTracebacks" flag in their Tub to False; the
112server will then remove stack information from the CopiedFailure objects it
113sends to other systems.</p>
114
115<pre class="python">
116t = Tub()
117t.unsafeTracebacks = False
118</pre>
119
120<p>When unsafeTracebacks is False, the <code>CopiedFailure</code> will only
121contain the stringified exception type, value, and parent class names.</p>
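<p>The effect of the flag can be pictured with the following sketch, which
builds a string-only description of an exception. The
<code>failure_state</code> function and its dictionary keys are hypothetical
names chosen for this example; they do not describe Foolscap's wire
format.</p>

```python
import traceback


def failure_state(exc, unsafe_tracebacks=True):
    """Describe exc using strings only (hypothetical field names)."""
    cls = type(exc)
    state = {
        "type": "%s.%s" % (cls.__module__, cls.__name__),
        "value": str(exc),
        "parents": ["%s.%s" % (c.__module__, c.__name__)
                    for c in cls.__mro__],
    }
    if unsafe_tracebacks:
        # Only include stack information when the flag allows it.
        state["traceback"] = "".join(
            traceback.format_exception(cls, exc, exc.__traceback__))
    return state


try:
    raise ValueError("bad id")
except ValueError as e:
    caught = e

safe_state = failure_state(caught, unsafe_tracebacks=False)
full_state = failure_state(caught)
```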
122
123<h2>Distinguishing Remote Exceptions</h2>
124
125<p>The original caller can tell the difference between exceptions that
126occurred locally and ones that occurred on the remote end. The most common
127use for this is to re-raise exceptions that resulted from programming errors
128in the local code, while cleanly handling or ignoring errors that were caused
129by the code at the remote end. The general idea is that remote code may be
130maliciously trying to confuse or subvert your program's control flow by
131returning unexpected exceptions, but that exceptions which occur locally (and
132are not otherwise caught and handled) are probably bugs which need to be made
133visible. The philosophy of how to best handle errors is beyond the scope of
134this document, but Foolscap tries to provide the tools to allow programmers
135to implement whatever approach they choose.</p>
136
137<p>It is useful to distinguish a remote exception from a local one,
138especially when the code involves multiple processing steps (some local, some
139remote). For example, the following snippet performs a local processing step,
140then asks a remote server for information, then adds that information into a
141local database. All three steps are asynchronous.</p>
142
143<pre class="python">
144def get_and_store_record(name):
145    d = local_db.getIDNumber(name)
146    d.addCallback(lambda idnum: rref.callRemote("get_record", idnum))
147    d.addCallback(lambda record: local_db.storeRecord(name, record))
148    return d
149</pre>
150
151<p>The caller of <code>get_and_store_record</code> might like to distinguish
152a problem that occurred in <code>getIDNumber</code> from one that
153occurred during the remote call to <code>remote_get_record</code>.</p>
154
155<p>For each Foolscap event that can raise a remote exception described above
156(i.e. remote inbound schema Violation, remote method exception, remote
157outbound schema Violation), the original caller will receive
158a <code>CopiedFailure</code> instance. For Foolscap events that raise
159exceptions locally (local outbound schema Violation, local inbound schema
160Violation), the caller will receive a regular <code>Failure</code> instance.
161Any non-Foolscap exception events (i.e. the <code>getIDNumber</code>
162and <code>storeRecord</code> calls in the example above) will also get
163a regular <code>Failure</code>.</p>
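<p>The mapping from Foolscap event to delivered failure class can be
summarized as a small lookup table. The names <code>DELIVERED_AS</code> and
<code>is_remote_event</code> are illustrative only and are not part of
Foolscap. The network-partition case is left out because the text does not
state which class it is delivered as.</p>

```python
# Which class the caller's errback receives for each Foolscap event,
# as stated in the text. Names here are illustrative, not Foolscap API.
DELIVERED_AS = {
    "local outbound schema violation": "Failure",
    "remote inbound schema violation": "CopiedFailure",
    "remote method exception": "CopiedFailure",
    "remote outbound schema violation": "CopiedFailure",
    "local inbound schema violation": "Failure",
}


def is_remote_event(event):
    """True if the event reaches the caller as a CopiedFailure."""
    return DELIVERED_AS[event] == "CopiedFailure"
```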
164
165<p>Application code should use <code>foolscap.ipb.failure_is_remote()</code>
166to distinguish between local and remote failures. This returns True
167for <code>CopiedFailure</code> instances and False for
168regular <code>Failure</code>s. A future version of Foolscap may change the
169way <code>CopiedFailure</code> is used (ideally Failure and CopiedFailure
170should be the same class), but <code>failure_is_remote</code> will continue
171to work correctly.</p>
172
173<pre class="python">
174d = get_and_store_record("bob")
175def handle_remote_exception(f):
176    if not failure_is_remote(f):
177        return f
178    print("Remote caller failed: %s" % (f,))
179    print("no record stored")
180    return None
181d.addErrback(handle_remote_exception)
182</pre>
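<p>To make the control flow concrete without a running Tub, the sketch below
models the predicate with stand-in classes. It assumes only what the text
states, namely that the predicate is True for <code>CopiedFailure</code>
instances and False for regular <code>Failure</code>s. The classes here are
simplified placeholders, and the real implementation may differ.</p>

```python
class Failure(object):
    """Stand-in for twisted.python.failure.Failure (illustration only)."""

    def __init__(self, message):
        self.message = message


class CopiedFailure(Failure):
    """Stand-in for the failure type delivered for remote exceptions."""


def failure_is_remote(f):
    """Assumed behaviour: True only for CopiedFailure instances."""
    return isinstance(f, CopiedFailure)


def handle_remote_exception(f):
    """Swallow remote failures; pass local ones on to the next errback."""
    if not failure_is_remote(f):
        return f      # local problem: probably a bug, keep it visible
    return None       # remote problem: handled here


local_failure = Failure("bug in local_db")
remote_failure = CopiedFailure("remote_get_record raised KeyError")
```

<p>Returning the failure from an errback keeps the Deferred on its error
path, while returning <code>None</code> switches it back to the callback
path, which is why the local case is passed through unchanged.</p>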
183