Opened 16 years ago
Last modified 13 years ago
#106 closed defect
KeyError from self.myReferenceByCLID[clid] — at Version 3
Reported by: | Zooko | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 0.6.3 |
Component: | unknown | Version: | 0.3.0 |
Keywords: | KeyError | Cc: | davidsarah |
Description (last modified by )
I don't know if this is normal or is indicative of a problem. This is from running some unit tests in tahoe-lafs.
local#4813 09:57:13.711: deep-check done local#4814 09:57:13.725: shutting down SystemTest services local#4815 09:57:13.726: Node.stopService local#4816 09:57:13.728: Node._really_stopService local#4817 09:57:13.774: Node.stopService local#4818 09:57:13.777: Node._really_stopService local#4819 09:57:13.825: Node.stopService local#4820 09:57:13.828: Node._really_stopService local#4821 09:57:13.871: Node.stopService local#4822 09:57:13.873: Node._really_stopService local#4823 09:57:13.916: Node.stopService local#4824 09:57:13.918: Node._really_stopService local#4825 09:57:13.969: Node.stopService local#4826 09:57:13.972: Node._really_stopService local#4827 09:57:14.053: an inbound callRemote that we [bduf] executed (on behalf of someone else, TubID ncie) failed local#4828 09:57:14.053: reqID=248, rref=<foolscap.broker.Broker object at 0x39653b0>, methname=RIBroker.decref local#4829 09:57:14.053: args=[] local#4830 09:57:14.053: kwargs={'count': 1, 'clid': 29} local#4831 09:57:14.053: the LOCAL failure was: FAILURE: [CopiedFailure instance: Traceback from remote host -- Traceback (most recent call last): File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/foolscap-0.3.2-py2.5.egg/foolscap/eventual.py", line 26, in _turn cb(*args, **kwargs) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/foolscap-0.3.2-py2.5.egg/foolscap/broker.py", line 536, in doNextCall d.addCallback(lambda res: self._doCall(delivery)) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/Twisted-8.1.0_r25700-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 195, in addCallback callbackKeywords=kw) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/pytho -- TRACEBACK ELIDED -- /twisted/internet/defer.py", line 186, in addCallbacks self._runCallbacks() --- <exception caught here> --- File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/Twisted-8.1.0_r25700-py2.5-macosx-10.3-i386.egg/twisted/internet/defer.py", line 328, in _runCallbacks self.result = callback(self.result, *args, **kw) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/foolscap-0.3.2-py2.5.egg/foolscap/broker.py", line 536, in <lambda> d.addCallback(lambda res: self._doCall(delivery)) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/foolscap-0.3.2-py2.5.egg/foolscap/broker.py", line 571, in _doCall return obj.doRemoteCall(delivery.methodname, args, kwargs) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/foolscap-0.3.2-py2.5.egg/foolscap/referenceable.py", line 59, in doRemoteCall res = meth(*args, **kwargs) File "/usr/local/stow/python-release25-maint-2008-12-23/lib/python2.5/site-packages/foolscap-0.3.2-py2.5.egg/foolscap/broker.py", line 421, in remote_decref tracker = self.myReferenceByCLID[clid] exceptions.KeyError: 29 ]
Change History (3)
comment:1 Changed 13 years ago by
Keywords: | KeyError added |
---|
comment:2 Changed 13 years ago by
Cc: | davidsarah added |
---|
comment:3 Changed 13 years ago by
Description: | modified (diff) |
---|
The first exception is pretty normal. When two Tubs in the same process are shut down (such as at the end of a unit test), they drop references, which triggers the cross-Tub GC messages, which may be delivered after the other end has forgotten about the object being decref'ed. It's harmless, although it'd be a good idea for Foolscap to tolerate them silently: fixed in [8d1ca85].
The second one is probably similar. It indicates that a message was sent to a local object which had gone out-of-scope by the time the message was processed. The object in question was the first one allocated (clid=1, since 0 is used for the Broker itself, for remote_decref), which (if this is a storage server) is probably the RIStorageServer instance (whatever object was the target of the first inbound getReference() call).
That sort of problem *ought* to be handled by returning an error message to the remote caller, not with log.err(). The requestID is the very first token in the CALL sequence, so we know who to send the error to. I think it's a bug/oversight in call.py which allows that error to escape, and worth fixing.
I also saw this in the twistd.log of a storage server running on one of LAE's EC2 instances:
Unfortunately I don't have much context about what might have caused it, since I only noticed these log entries much later.