Opened 17 years ago
Closed 17 years ago
#40 closed enhancement (fixed)
compress flog pickles
Reported by: | Zooko | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | 0.2.3 |
Component: | logging | Version: | 0.2.2 |
Keywords: | Cc: |
Description
Running the tahoe test suite with FLOGFILE=flog.pickle and FLOGTWISTED=1 generates:
flog.pickle size: 71221968
time to bzip2 it: real 0m42.529s
bzipped filesize: 3785336
Running the tahoe test suite after patching foolscap to use HIGHEST_PROTOCOL generates:
flog.pickle size: 59686279
time to bzip2 it: real 0m37.614s
bzipped filesize: 3039383
So it might be nice to use protocol=2. (I guess that is better than HIGHEST_PROTOCOL: if a new version of Python comes out with a newer protocol and someone uses foolscap on that version of Python, the resulting pickles could not be read with Python 2.5/2.6/3.0.) It might also be nice to add a bzip2 filter.
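The size comparison above can be reproduced in miniature with the standard library. This is only a rough sketch: the `events` list is made-up sample data, not real foolscap log events, and the exact byte counts will differ from the numbers in this ticket. It pins the protocol to 2 rather than `pickle.HIGHEST_PROTOCOL`, for the compatibility reason given above.

```python
import bz2
import pickle

# Hypothetical stand-in for log events; real flog events look different.
events = [
    {"num": i, "level": 20, "message": "event number %d" % i,
     "time": 1200000000.0 + i}
    for i in range(1000)
]

# Protocol 0 is the old text-based format; protocol 2 is binary.
size_proto0 = len(pickle.dumps(events, 0))
pickled_proto2 = pickle.dumps(events, 2)
size_proto2 = len(pickled_proto2)

# Compress the protocol-2 pickle, as a bzip2 filter would.
compressed = bz2.compress(pickled_proto2)
size_bz2 = len(compressed)

print("protocol 0:", size_proto0)
print("protocol 2:", size_proto2)
print("protocol 2 + bz2:", size_bz2)

# The compressed pickle still round-trips to the original data.
restored = pickle.loads(bz2.decompress(compressed))
```

Pinning `protocol=2` keeps the output readable by any Python that understands protocol 2, whatever the writing interpreter's highest protocol is.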
Change History (2)
comment:1 Changed 17 years ago by
comment:2 Changed 17 years ago by
Milestone: | undecided → 0.2.3 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
ok, I implemented this (using bz2, if the FLOGFILE= name ends in .bz2), and used a reactor.addSystemEventTrigger to close the file at shutdown. This seems to work well enough for trial tests.
In the next version of foolscap (0.2.3), you'll want to run trial like so:
FLOGFILE=flog.out.bz2 FLOGTWISTED=1 trial blah
The tahoe unit tests will give you a 3MB flog.out.bz2, which actually represents something like 70-100MB of logged events. (FLOGLEVEL= defaults to '1', which gets even the NOISY messages).
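The suffix-based switch described in this comment might look roughly like the sketch below. It is not foolscap's actual code: `open_flogfile`, `write_events`, and `read_events` are hypothetical names, and the assumption that events are written as a sequence of independent pickles is mine. The explicit `close()` here stands in for the shutdown hook; under Twisted, the equivalent would be something like `reactor.addSystemEventTrigger("before", "shutdown", f.close)`.

```python
import bz2
import pickle


def open_flogfile(name):
    """Open a log file for writing, compressed if the name ends in .bz2."""
    if name.endswith(".bz2"):
        return bz2.BZ2File(name, "wb")
    return open(name, "wb")


def write_events(name, events):
    """Dump each event as its own pickle (assumed format), then close."""
    f = open_flogfile(name)
    try:
        for event in events:
            pickle.dump(event, f, 2)  # protocol 2, as suggested in the ticket
    finally:
        # Closing is what flushes the final bz2 block to disk.
        f.close()


def read_events(name):
    """Read back every pickle in the file until end of input."""
    opener = bz2.BZ2File if name.endswith(".bz2") else open
    events = []
    with opener(name, "rb") as f:
        while True:
            try:
                events.append(pickle.load(f))
            except EOFError:
                break
    return events
```

For example, `write_events("flog.out.bz2", events)` followed by `read_events("flog.out.bz2")` should return the same list of events.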
Excellent idea. I got similar numbers from a quick test: 65MB for protocol=None and 48MB for protocol=-1. (The amount of output is variable: I got 52MB for proto=2, when that ought to be the same as proto=-1, right?) I shall implement both protocol=-1 and bz2.BZ2File output when the FLOGFILE= name ends in .bz2.
One existing problem which will probably be exacerbated by this is buffering/flushing. Because this logger sits outside the usual application startup/shutdown code, there's no good time to tell it to close the file and therefore flush all data. Previously I was relying upon the tendency of open files to write their data out whenever necessary, but if the log events are compressed then it could be quite a while before this happens. Indeed, in running tests of the BZ2File change, I see a truncated .bz2 file that 'bzcat' chokes on.
Hrm, must think about this some more.
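The truncation problem described above can be illustrated with the standard library alone. In this sketch (sample data invented for illustration), the compressor holds the data in its internal buffer until `flush()` is called, which is roughly what happens when a BZ2File is never closed: whatever reached the disk is not a complete bz2 stream.

```python
import bz2

data = b"log event\n" * 1000

compressor = bz2.BZ2Compressor()
# Output produced before flushing; for small inputs this may be empty,
# because bz2 buffers a whole block before emitting anything.
partial = compressor.compress(data)
# flush() emits the buffered block and the end-of-stream marker.
complete = partial + compressor.flush()

# Decompressing the unflushed output never reaches end-of-stream.
d1 = bz2.BZ2Decompressor()
recovered_partial = d1.decompress(partial)
truncated_reached_eof = d1.eof

# Decompressing the flushed output recovers everything.
d2 = bz2.BZ2Decompressor()
recovered = d2.decompress(complete)
complete_reached_eof = d2.eof

print("bytes before flush:", len(partial))
print("bytes after flush:", len(complete))
```

This is why an explicit close at shutdown matters more for the compressed log than for the plain one.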