#104 closed defect (fixed)
'reference' tokens are occasionally broken
Reported by: | Brian Warner | Owned by: | |
---|---|---|---|
Priority: | critical | Milestone: | 0.4.0 |
Component: | banana | Version: | 0.3.0 |
Keywords: | Cc: |
Description
While investigating Tahoe#541, it seems to me that the 'reference' token is just plain broken. Specifically, the calls to setObject
are showing up with counter values which don't match up.
One test client did a remote call with the sequence:
- OPEN(4) "call" 3 1 "allocate_buckets" OPEN(5) "arguments" 5 "arg1" "arg2" OPEN(6) "set" 1 CLOSE(6) ...
and on the server side I saw setObject(5) called with a set, and no call to setObject(6).
Something is screwy, it looks like somehow the counter value is corrupted or stale when it is passed in to setObject(). A quick glance suggests that setObject is being called at the right time and with the right objects, it's just that the counter= value is wrong (usually 1 smaller than it should be)
Change History (4)
comment:1 Changed 16 years ago by
comment:2 Changed 16 years ago by
Milestone: | undecided → 0.3.3 |
---|---|
Priority: | major → critical |
comment:3 Changed 16 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed, in [a59482b3d45368220234f824d28d692c7ee343b7]. The deserialization code is gnarly, but I decided to refrain from a big refactoring project and instead just fix the bug with as little code as possible.
comment:4 Changed 16 years ago by
Milestone: | 0.3.3 → 0.4.0 |
---|
Ah, I think I found it. So, for review:
Now, the problem I'm seeing here is due to the Violation that occurred earlier (in the case of Tahoe #541, it was because we started using a versioning scheme which invokes a method that doesn't exist on older servers). When a Violation is raised, Foolscap discards all tokens until the current unslicer frame is exited (this prevents a resource-exhaustion attack: discarding the tokens minimizes the memory and CPU that they can consume).
The specific problem is that this token-discarding also discards the intervening OPEN tokens, which means the object counter is not incremented. For every OPEN token that was in the discarded stream, the object counter and the open counter will drift by one. All subsequent setObject() calls will get the wrong number and will put the object in the wrong place. Any 'reference' tokens will call getObject() with the correct value, and since it won't match the setObject() value, getObject() will either get the wrong thing or it won't get anything at all.
So, the fix will be to count OPEN tokens (i.e. increment the 'object counter') even when we're discarding tokens because of a Violation. The unit test needs to do:
["list",3,4]
list instead, and this will either trigger a Violation (if the schema is tight enough), or the code in remote_foo can detect that it gets a third argument of["list",3,4]
instead of the expected("tuple",1,2)