Opened 9 years ago
Closed 7 years ago
#247 closed task (fixed)
serialize logs with JSON, not pickle
Reported by: | Brian Warner | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 0.13.0 |
Component: | logging | Version: | 0.9.1 |
Keywords: | Cc: |
Description
split out of #244
There are several of places which might serialize log events:
- via
LogFileObserver
, when$FLOGFILE=
is set at process startup - in a locally-written Incident file, via
foolscap.logging.log.setLogDir()
- the log gatherer and incident gatherer
- via
flogtool tail --save-to=
- in the output of
flogtool filter
All of these currently use the stdlib Pickle module to read and write serialized log events. The files (which may be BZ2-compressed, or uncompressed) contain concatenated pickle bytestrings, so they can be read by calling pickle.load(f)
repeatedly, until it throws an error.
This ticket is about replacing these pickles with something safer, namely JSON. Pickles are unsafe, and obviously I should have never used them in the first place.
My plan is to delete all the pickle-handling code and replace it with JSON-handling code (newline-separated JSON records, read with f.readlines()
and JSON.loads()
. The new version won't import pickle
at all, and the only nod to backwards compatibility will be a check that looks at the first few bytes of the logfile to see if it might be a pickle. If we see one, the tool will print an apology message which explains the unsafety, and mentions that installing an old version of Foolscap is the only way to read these logfiles.
Change History (5)
comment:1 Changed 9 years ago by
comment:2 Changed 9 years ago by
Milestone: | 0.10.0 → 0.11.0 |
---|
comment:4 Changed 8 years ago by
Milestone: | 0.12.0 → 0.13.0 |
---|
Moving these tickets from 0.12 to 0.13, since we need 0.12 out quickly to resolve #263
Two problems I've encountered so far with the de-pickle-ization task:
log.err()
stores aFailure
orCopiedFailure
object into the event buffer. This was great when we were serializing with pickle or Banana, but they aren't JSONable.We need a way to serialize the relevant data from a Failure into something JSON-compatible. We also need a good place to do the serialization. There is a clause in
log.msg()
which looks for afailure=
argument, which is basically the one polite way to pass in a Failure object (note thatlog.err()
useslog.msg(failure=)
internally). This is the most obvious place to catch Failures. The clause currently replaces anyFailure
with aCopiedFailure
, and this could be enhanced to replace it with something else.What should we replace it with? It'd be nice if the serialization were lazy. I don't know how much work is done when the Failure is created, but it'd be nice if we didn't do a lot of additional work inside
log.msg()
, since chances are good that the event will be discarded without anybody ever looking at it. If we were using Pickle, we could replace it with a wrapper object whose__getstate()__
method did the conversion. Perhaps thejson.dumps(object_hook=)
hook can help.