Opened 14 years ago

Last modified 13 years ago

#147 new defect

NoneType is not callable error in RemoteReferenceTracker

Reported by: bgranger Owned by:
Priority: major Milestone: eventually
Component: unknown Version: 0.4.1
Keywords: Cc:

Description (last modified by Brian Warner)

Brian,

A few months ago, I emailed you about this. I said I would file a ticket, but lost track of it. Here is the report.

I hope things are going well. I am seeing a very odd error in foolscap occasionally. Here is what I see:

Exception exceptions.TypeError: "'NoneType' object is not callable" in <bound
method RemoteReferenceTracker._refLost of
<RemoteReferenceTracker(clid=1,url=pb://ahrpv6rtya6exuk75h2quxaubdkdubst@10.0.1.163:55071,127.0.0.1:55071/quegnf6enammggq33rgwiq3sgdt4dhh5)>>
ignored

This happens sometimes when my foolscap using process *ends*. I looked at RemoteRemoteReferenceTracker?:

    def _refLost(self, wref):
        # don't do anything right now, we could be in the middle of all sorts
        # of weird code. both __del__ and weakref callbacks can fire at any
        # time. Almost as bad as threads..

        # instead, do stuff later.
        eventually(self._handleRefLost)

    def _handleRefLost(self):
        if self.ref() is None:
            count, self.received_count = self.received_count, 0
            if count == 0:
                return
            self.broker.freeYourReference(self, count)
        # otherwise our RemoteReference is actually still alive, resurrected
        # between the call to _refLost and the eventual call to
        # _handleRefLost. In this case, don't decref anything.

I suspect the problem is in _handleRefLost, when it does if self.ref() is None. For some reason self.ref is None so calling it gives the TypeError?. Should this be protected by a "if self.ref is None"?

Your response:

Yeah, please file a bug on that one. Your analysis sounds correct, but I'd need to walk through the code to find where .ref is being manipulated, and a bug will help me not forget about it. (it'll be mid-week before I get a chance to look at it).

Thanks,

Brian Granger

Change History (4)

comment:1 Changed 14 years ago by Brian Warner

Description: modified (diff)

updated formatting

comment:2 Changed 13 years ago by Brian Warner

Milestone: 0.6.0eventually

huh, so now I'm not really sure what's going on. Certainly when the process is ending, destructors/finalizers are firing in weird orders, so all bets are off. I don't see any path by which _handleRefLost could observe self.ref to be None, but it can't hurt to add an if self.ref is None or self.ref() is None in there. [d55bc49e7e346d3ed53cdac46e02a15938b8772e] adds this.

If the exception is really happening in _refLost, though, then it implies that "eventually" has been nulled out, which would be *really* weird.

I know it's been a while, but are you still seeing this? Is it at all reproducible?

comment:3 Changed 13 years ago by Zooko

In my opinion one should not spend time maintaining code which is supposed to run when a process is ending. Instead one should design the overall system to work well even if the process terminates ungracefully (such as by kill -9 or by a kernel crash or a power outage or hardware failure), in which case code that is supposed to run when a process is ending is irrelevant. (I think this is called "crash-only software".)

I've seen a lot of time wasted trying to reproduce/diagnose/improve code that runs "at shutdown", but I've rarely seen any benefit from that sort of code.

(Brian already well understands my opinion about this, but I thought I would post it explicitly on this ticket for the record.)

comment:4 Changed 13 years ago by Brian Warner

Description: modified (diff)

well, in this case, the code exists to handle references being GCed *before* the process shuts down. It's just Python's eagerness to cleanup everything at shutdown (which is counter to your recommended crash-only approach) which happens to be firing our Referenceable's destructors. If we had a way to know that the whole process was being torn down soon anyways, we could change the code to just skip these destructors (no need to send a "you can drop reference 12 now" message when the TCP connection is about to be closed). But I don't know how to detect that.

Note: See TracTickets for help on using tickets.