Opened 16 years ago
Last modified 15 years ago
#101 new defect
traceback from remote host doesn't come with identifying information about which remote host
Reported by: | Zooko | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | error-handling | Version: | 0.3.0 |
Keywords: | | Cc: | Zooko, zooko@… |
Description
ax5avvay#1693941923 21:24:12.428: error during query: [CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/foolscap/eventual.py", line 26, in _turn
    cb(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/foolscap/broker.py", line 536, in doNextCall
    d.addCallback(lambda res: self._doCall(delivery))
  File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 191, in addCallback
    callbackKeywords=kw)
  File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 182, in addCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/lib/python2.5/site-packages/twisted/internet/defer.py", line 317, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/usr/lib/python2.5/site-packages/foolscap/broker.py", line 536, in <lambda>
    d.addCallback(lambda res: self._doCall(delivery))
  File "/usr/lib/python2.5/site-packages/foolscap/broker.py", line 571, in _doCall
    return obj.doRemoteCall(delivery.methodname, args, kwargs)
  File "/usr/lib/python2.5/site-packages/foolscap/referenceable.py", line 59, in doRemoteCall
    res = meth(*args, **kwargs)
  File "/usr/lib/python2.5/site-packages/allmydata/storage.py", line 1202, in remote_slot_readv
    msf = MutableShareFile(filename, self)
  File "/usr/lib/python2.5/site-packages/allmydata/storage.py", line 383, in __init__
    data = f.read(self.HEADER_SIZE)
exceptions.IOError: [Errno 5] Input/output error
] [Errno 5] Input/output error
I would really like to know which storage server is having I/O errors. Too bad this stack trace doesn't tell me. Any identifying information would work -- tubid or IP address would be fine.
Change History (2)
comment:1 Changed 16 years ago by
comment:2 Changed 15 years ago by
Component: | logging → error-handling |
---|
So, for background, the following sequence of events occurred:
- callRemote to some node Bob
- addErrback handler, which uses foolscap.logging.log.msg to record the error, with a bit of code that looks something like:
My usual practice is to throw enough information into the log.msg() arguments that we can figure out the context of the message later on, but obviously I failed to do that here. The sort of thing that I would add would either be the peerid that we sent the query to, or a parent= which referenced an earlier message which had that same information (and which would mention the storageindex we were querying, etc). Using parent= would mean less extra context that needed to be passed to the errback handler, but also a higher risk that the parent message might be pruned (if it were at a lower severity level). So something like this:
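A minimal sketch of the two options just described, under stated assumptions: `log_event`, `render`, and `make_query_errback` are hypothetical names, and `log_event` is a plain-Python stand-in for a structured logger that takes a format string plus keyword arguments and returns an event number usable as `parent=`. Real code would call foolscap.logging.log.msg instead; only the idea of passing the peerid and a parent reference is taken from the text above.

```python
# Hypothetical in-memory stand-in for a structured logger.
events = []

def log_event(format, parent=None, **fields):
    """Record one event and return its number so later events can cite it as parent."""
    events.append({"format": format, "parent": parent, "fields": fields})
    return len(events) - 1

def render(event):
    """Fill the event's format string from its keyword fields."""
    return event["format"] % event["fields"]

def make_query_errback(peerid, parent):
    """Build an errback that records which peer the failed query was sent to."""
    def _query_failed(failure):
        log_event("error during query to peer %(peerid)s: %(failure)s",
                  parent=parent,    # links back to the 'sending query' event
                  peerid=peerid,    # kept here too, in case the parent is pruned
                  failure=failure)
        return failure              # pass the failure on to later errbacks
    return _query_failed

# The earlier message carries the full context (peer and storage index).
start = log_event("sending query to %(peerid)s for %(storage_index)s",
                  peerid="peer-a", storage_index="si-1")
errback = make_query_errback("peer-a", start)
errback("exceptions.IOError: [Errno 5] Input/output error")
print(render(events[-1]))
```

With a Twisted Deferred, the returned function would be attached with `d.addErrback(make_query_errback(peerid, start))`. Passing both the peerid and the parent reference costs one extra argument but keeps the message identifiable even if the parent event has been dropped.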
But clearly, a more general solution would be to see if we could get this information embedded in the CopiedFailure. Now, since processes can have multiple Tubs, it wouldn't be appropriate to put this in the original Failure, but the act of serializing the Failure involves a specific Tub, so that first serialization would be a good place to stuff in the tubid. Failures are serialized into bits, those bits are deserialized into CopiedFailures. The CopiedFailure can be shipped around after the fact (CopiedFailure to bits to a new CopiedFailure), so it's just the initial serialization from Failure to bits where this should be performed.
Also, we need a way to get the nodeid into the str() or repr() of the CopiedFailure. At the moment we're just subclassing Failure, and not overriding the stringification code at all, so that would need to be changed.