Opened 14 years ago
Last modified 14 years ago
#186 new defect
Remote tracebacks often elide important lines
| Reported by: | davidsarah | Owned by: | |
|---|---|---|---|
| Priority: | major | Milestone: | undecided |
| Component: | error-handling | Version: | 0.6.1 |
| Keywords: | traceback elided | Cc: |
Description
Remote exception tracebacks often have -- TRACEBACK ELIDED -- replacing the lines that would be most useful for debugging. I think this is a consequence of the default stringification of Twisted Failures. In foolscap/tokens.py@b9bc4a4#L91, we have
def __str__(self):
return "<RemoteException around '%s'>" % str(self.failure)
which should probably instead be using the traceback module to get a more complete traceback from self.failure.value.
Change History (4)
comment:1 Changed 14 years ago by
comment:2 Changed 14 years ago by
Oh, it's just this code in FailureSlicer.getStateToCopy:
if len(state['traceback']) > 1900:
state['traceback'] = (state['traceback'][:700] +
"\n\n-- TRACEBACK ELIDED --\n\n"
+ state['traceback'][-1200:])
I commented that out and got a traceback that was unmutilated ;-)
I think 1900 bytes is way too short.
comment:3 follow-up: 4 Changed 14 years ago by
Seems reasonable. The 2k limitation was 1: to limit DoS sorts of attacks (and
in #127 I decided to give up on resource-consumption-attack defenses), and 2:
to keep the important stuff visible (to the eventual log-reading human trying
to debug the problem) despite tracebacks that include overly-deep stacks
(recursion gone wrong and running up against sys.setrecursionlimit. I
also remember having problems with printable forms that include large
(usually binary) strings, like tahoe shares or blocks, where the actual
contents weren't interesting but were obscuring the useful metadata.
I was expecting the traceback to only contain filenames, line numbers, and
code snippets. The value of the exception might contain larger strings,
as created by a __repr__ on the Exception class. Perhaps there's a
bug/misdesign such that the value is being delivered as part of the traceback
instead of as a separate attribute? If we fixed that, and delivered
repr(f.value) in its own separately-size-limited field, then would a
1900-byte stack trace be sufficient?
comment:4 Changed 14 years ago by
Replying to warner:
I was expecting the traceback to only contain filenames, line numbers, and code snippets. The value of the exception might contain larger strings, as created by a
__repr__on the Exception class.
In the specific case that prompted me to file this ticket, the __repr__ was a few thousand bytes/characters, and I needed both the elided stack trace and the elided part of the __repr__. I've also run into this a couple of times before, and my memory of those cases isn't sufficient to recall whether the exception __repr__ was unusually large.
A conversation about one of them was pasted at http://codepad.org/eXchc6Xz:
<davidsarah> josipl: unfortunately, the -- TRACEBACK ELIDED -- in the original traceback (http://gist.github.com/411241) would have been the interesting bit <davidsarah> that's what we're trying to get at
Looking at that gist, the interesting part of the stack trace was a slightly deeper than usual but not especially so, and the Exception __repr__ was small.
Perhaps there's a bug/misdesign such that the value is being delivered as part of the traceback instead of as a separate attribute? If we fixed that, and delivered
repr(f.value)in its own separately-size-limited field, then would a 1900-byte stack trace be sufficient?
Judging by the second example above, no.

Actually it's more complicated than the description says, the case I'm interested in occurs when the RemoteException is around a CopiedFailure. I think that whatever sets the
'traceback'element of CopiedFailure's state is at fault; still investigating.Here's an example of a mutilated error report (edited for long lines):
<class 'allmydata.mutable.common.NotEnoughServersError'>: ('Publish ran out of good servers, last failure was: [Failure instance: Traceback (failure with no frames): <class \'foolscap.tokens.RemoteException\'>: <@RemoteException around \'[@CopiedFailure instance: @Traceback from remote host -- Traceback (most recent call last):\n File "/usr/local/lib/python2.6/dist-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/tcp.py", line 277, in connectionLost\n protocol.connectionLost(reason)\n File "/usr/local/lib/python2.6/dist-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/web/client.py", line 191, in connectionLost\n self.factory._disconnectedDeferred.callback(None)\n File "/usr/local/lib/python2.6/dist-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 368, in callback\n self._startRunCallbacks(result)\n File "/usr/local/lib/python2.6/dist-packages/Twisted-12.0.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 464, in _startRunC\n \n-- TRACEBACK ELIDED --\n\n7 6f 61 76 6a 7a 6b 76 66 2f</StringToSignBytes> [AWS credentials redacted by David-Sarah]</Error>\')\n]\'>\n]', None)(The '@' characters are where I edited the local foolscap to identify which parts were coming from that code.)