Opened 16 years ago

Closed 16 years ago

#40 closed enhancement (fixed)

compress flog pickles

Reported by: Zooko
Owned by:
Priority: minor
Milestone: 0.2.3
Component: logging
Version: 0.2.2
Keywords:
Cc:

Description

Running the tahoe test suite with FLOGFILE=flog.pickle and FLOGTWISTED=1 generates:

flog.pickle size: 71221968
time to bzip2 it: real    0m42.529s
bzipped filesize:  3785336

Running the tahoe test suite after patching foolscap to use HIGHEST_PROTOCOL generates:

flog.pickle size: 59686279
time to bzip2 it: real    0m37.614s
bzipped filesize:  3039383

So it might be nice to use protocol=2. I guess that is better than HIGHEST_PROTOCOL: if a new version of Python comes out with a newer protocol and someone uses foolscap on that version of Python, HIGHEST_PROTOCOL would produce pickles that they couldn't read with Python 2.5/2.6/3.0. It might also be nice to add a bzip2 filter.
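
A minimal sketch of the two suggestions above, pinning the pickle protocol and adding a bzip2 step. The event dictionaries are hypothetical and do not match foolscap's real log event format, and the sizes printed here say nothing about the numbers measured in this ticket.

```python
import bz2
import pickle

# Hypothetical log events; real foolscap events carry different fields.
events = [
    {"num": i, "level": 20, "message": "event number %d" % i}
    for i in range(2000)
]

# Pin the protocol so the output does not depend on the running Python.
pinned = b"".join(pickle.dumps(e, 2) for e in events)
# HIGHEST_PROTOCOL varies with the interpreter that writes the file.
newest = b"".join(pickle.dumps(e, pickle.HIGHEST_PROTOCOL) for e in events)

# Binary pickles (protocol >= 2) start with a PROTO opcode naming the version.
print("pinned protocol byte:", pinned[1])
print("newest protocol byte:", newest[1])

# The "bzip2 filter": compress the whole pickle stream in one shot.
compressed = bz2.compress(pinned)
print("raw bytes:", len(pinned), "bzipped bytes:", len(compressed))
```

The second byte of each stream shows why pinning matters: the pinned stream always declares protocol 2, while the other declares whatever the writing interpreter considers newest.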

Change History (2)

comment:1 Changed 16 years ago by Brian Warner

Excellent idea. I got similar numbers from a quick test: 65MB for protocol=None and 48MB for protocol=-1. (The amount of output is variable: I got 52MB for proto=2, when that ought to be the same as proto=-1, right?) I shall implement both protocol=-1 and bz2.BZ2File output if the FLOGFILE= name ends in .bz2.
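
The plan above (choose bz2.BZ2File by filename suffix, pickle with protocol=-1) could look roughly like the following. This is only a sketch: open_flogfile and read_events are hypothetical helper names, not foolscap functions, and the events are made up.

```python
import bz2
import os
import pickle
import tempfile


def open_flogfile(filename):
    """Hypothetical helper: pick the file type from the FLOGFILE= name."""
    if filename.endswith(".bz2"):
        return bz2.BZ2File(filename, "w")
    return open(filename, "wb")


def read_events(filename):
    """Hypothetical helper: yield pickled events until the file runs out."""
    opener = bz2.BZ2File if filename.endswith(".bz2") else open
    with opener(filename, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


# Usage: write a few made-up events into a temporary directory, read them back.
path = os.path.join(tempfile.mkdtemp(), "flog.out.bz2")
out = open_flogfile(path)
for i in range(3):
    pickle.dump({"num": i, "message": "hypothetical event"}, out, -1)
out.close()  # without this, the .bz2 stream is incomplete
print(list(read_events(path)))
```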

One existing problem which will probably be exacerbated by this is buffering/flushing. Because this logger sits outside the usual application startup/shutdown code, there's no good time to tell it to close the file and therefore flush all data. Previously I was relying upon the tendency of open files to write their data out whenever necessary, but if the log events are compressed then it could be quite a while before this happens. Indeed, in running tests of the BZ2File change, I see a truncated .bz2 file that 'bzcat' chokes on.
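
The buffering effect described above can be reproduced with the standard library alone. This sketch uses bz2.BZ2Compressor directly instead of a file, with made-up events, to show that compressed output lags behind the input until the stream is flushed, and that a stream cut short is unreadable.

```python
import bz2
import pickle

compressor = bz2.BZ2Compressor()
written = b""  # stands in for the bytes that have reached the file so far

for i in range(100):
    event = {"num": i, "message": "hypothetical event %d" % i}
    written += compressor.compress(pickle.dumps(event, 2))

bytes_before_flush = len(written)

# flush() is what a clean close performs: it emits the buffered block
# and the end-of-stream marker.
written += compressor.flush()
bytes_after_flush = len(written)
print("before flush:", bytes_before_flush, "after flush:", bytes_after_flush)

# A stream cut short before the end-of-stream marker cannot be decompressed.
truncated = written[: len(written) // 2]
try:
    bz2.decompress(truncated)
    truncated_ok = True
except (ValueError, OSError, EOFError):
    truncated_ok = False
print("truncated stream readable:", truncated_ok)
```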

Hrm, must think about this some more.

comment:2 Changed 16 years ago by Brian Warner

Milestone: undecided → 0.2.3
Resolution: fixed
Status: new → closed

OK, I implemented this (using bz2 if the FLOGFILE= name ends in .bz2) and used reactor.addSystemEventTrigger to close the file at shutdown. This seems to work well enough for trial tests.
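
For illustration, closing the log file from a shutdown trigger might be wired up as below. This is a sketch, not the code that went into foolscap: FlogWriter and close_at_shutdown are hypothetical names, and the "after" phase is an assumption, since the comment does not say which phase was used.

```python
import bz2
import pickle


class FlogWriter:
    """Hypothetical stand-in for the logger's file observer (not foolscap's class)."""

    def __init__(self, filename):
        self._file = bz2.BZ2File(filename, "w")

    def log(self, event):
        pickle.dump(event, self._file, 2)

    def close(self):
        # Closing BZ2File flushes the compressor and writes the end-of-stream marker.
        self._file.close()


def close_at_shutdown(reactor, writer):
    # addSystemEventTrigger(phase, eventType, callable) is Twisted's hook for
    # running code around reactor events such as "shutdown". The "after" phase
    # is an assumption here; the ticket does not say which phase was used.
    return reactor.addSystemEventTrigger("after", "shutdown", writer.close)
```

With Twisted installed, the reactor argument would be the object imported from twisted.internet, for example close_at_shutdown(reactor, FlogWriter("flog.out.bz2")). Passing the reactor in as a parameter keeps the sketch importable without Twisted.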

In the next version of foolscap (0.2.3), you'll want to run trial like so:

FLOGFILE=flog.out.bz2 FLOGTWISTED=1 trial blah

The tahoe unit tests will give you a 3MB flog.out.bz2, which actually represents something like 70-100MB of logged events. (FLOGLEVEL= defaults to '1', which gets even the NOISY messages).
