The daily mbox patch changes the way Mailman does its archive-to-mbox archiving, if that is enabled.
First, it is important to understand that using this patch means you will be messing with your archived message data, so IT IS VERY IMPORTANT TO BACK UP YOUR DATA before taking any irrevocable steps.
In the standard Mailman system a single UNIX mbox file called <listname>.mbox is maintained for each list in the directory $prefix/archives/private/<listname>.mbox, and each message is appended to that file as it is archived. For lists carrying a large amount of traffic, these files can become very unwieldy over time, presenting problems for disk space management.
The mailman daily mbox patch modifies Mailman's behaviour so that a sparse series of daily mbox files is used for archiving rather than a single mbox file. Each archived message is normally appended to a daily mbox file for the UTC date when the message is first archived.
The daily mbox files are named
YYYY is the year,
MM is the month (01-12) and
DD is the day (01-31), and stored in the
$prefix/archives/private/<listname>.mbox directory.
The daily mbox files are sparse because there will only be mbox files for those dates, UTC, when messages are written to the archive.
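Because the date is taken in UTC, a message archived late in the evening local time can land in the next day's file. The following is a minimal sketch of that date calculation only; it is not the patch's code, and the function name `utc_archive_date` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

def utc_archive_date(timestamp):
    """Return the UTC calendar date for a POSIX timestamp.

    Hypothetical helper: it only illustrates which UTC date an
    archiving moment falls on, not how the patch names its files.
    """
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).date()

# A message archived at 23:30 on 7 September in a UTC-5 time zone...
local = datetime(2009, 9, 7, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
# ...belongs to 8 September when the date is taken in UTC.
print(utc_archive_date(local.timestamp()))
```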
Splitting the mbox archive into daily mbox files is intended to make the management of the disk space used for the mbox files easier. For instance, past daily files can be gzip'ed individually to save storage space. Policies to limit the time for which archive material is retained or held online can also be implemented more easily.
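As an illustration of such a policy, here is a hedged sketch that gzips files in a directory once they have not been modified for a given number of days. It is not part of the patch. The function name, the `select` filter and the use of file modification time (rather than a date in the file name) are all assumptions, as is the `.gz` suffix on the output; check that the result matches what your patched arch recognizes before deleting anything.

```python
import gzip
import os
import shutil
import time

def gzip_old_files(directory, keep_days, select=lambda name: True, now=None):
    """Gzip files in `directory` last modified more than `keep_days` days ago.

    `select` is a caller-supplied filter on file names (assumption: use it
    to pick out daily mbox files and skip anything else, such as a list's
    original single mbox). Returns the paths of the .gz files created.
    """
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    created = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.endswith(".gz") or not select(name) or not os.path.isfile(path):
            continue
        if os.path.getmtime(path) >= cutoff:
            continue  # still recent; leave uncompressed
        target = path + ".gz"
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.remove(path)  # remove the original only after the copy completes
        created.append(target)
    return created
```

Run it against a backup copy first; with the default `select` it compresses every sufficiently old file in the directory.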
Amongst other changes, the patch modifies the arch utility. After patching, by default, arch only processes the daily mbox files in $prefix/archives/private/<listname>.mbox, as determined by pattern matching against the names of files in that directory, with gzip'ed daily mbox files also being recognized. The revised arch will process gzip'ed daily mbox files, although it runs somewhat slower when doing so.
arch can process a combination of daily mbox files and other mbox files: see the script usage by running
arch with the
A new utility
$prefix/bin/split_old_mbox is provided for splitting a list's existing mbox file(s) into daily mbox files. There is no magic in the way this works, and if you are turning to the patch because you are short of archive disk space you will still have to manage that problem. Splitting a very large mbox file takes a fair amount of run time and initially doubles the amount of disk space needed for the mbox: the space for the original file plus the space for the daily mboxes generated from it. However, once you have split the original file and deleted it to recover the space it occupied, you can recover more space by gzip'ing all but the most recent daily mbox files.
Note that split_old_mbox generates new daily mbox files and allocates messages to them based on the date in the UNIX From line immediately preceding the message in the mbox being split. It is thus important that your input mbox files are syntactically correct and can be parsed by an instance of Python's mailbox.UnixMailbox class. mbox files produced by Mailman as mail archives should be OK, but if you are trying to split mbox files from some other source you may need to run the cleanarch script or use other techniques to get a properly constructed UNIX mbox for input. For suggestions, see the Mailman FAQ or search the mailman-user archive.
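To make the allocation rule concrete, the sketch below groups the messages in an mbox by the date found in each UNIX From line. It is an illustration only, not the code of split_old_mbox: the function names are hypothetical, it reads the mbox as plain text rather than through mailbox.UnixMailbox, and it assumes the traditional separator form "From sender Wed Sep  9 13:21:00 2009" with body lines beginning "From " already escaped.

```python
import time
from collections import defaultdict

def from_line_date(from_line):
    """Return (year, month, day) parsed from a UNIX 'From ' separator line."""
    # Assumption: the last five whitespace-separated fields are the
    # asctime-style stamp: weekday, month, day, HH:MM:SS, year.
    stamp = " ".join(from_line.split()[-5:])
    parsed = time.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return parsed.tm_year, parsed.tm_mon, parsed.tm_mday

def group_by_from_date(mbox_text):
    """Map each From-line date to the list of raw messages carrying it."""
    groups = defaultdict(list)
    current = None
    for line in mbox_text.splitlines(keepends=True):
        if line.startswith("From "):
            current = from_line_date(line)
            groups[current].append("")  # start a new message for this date
        if current is not None:
            groups[current][-1] += line
    return dict(groups)
```

A From line that does not end in a parseable date makes `time.strptime` raise `ValueError`, which is one way a malformed input mbox would show up.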
Because of the way split_old_mbox allocates messages to the daily mbox files it produces, the allocation may differ subtly from the one Mailman makes when initially archiving an incoming message. During initial archiving, a message is allocated to the current daily mbox regardless of the value assigned to the UNIX From line written immediately prior to the message in the archive. This may not give the same result as when split_old_mbox is run and the UNIX From line controls the allocation.
You do not have to split a list's old mbox file, which can stay in the same directory as the newer daily mbox files as they are created. Splitting a list's old mbox file is probably worth doing in those cases where it is too large and unwieldy for your disk space management policy.
This patch is applicable to Mailman 2.1.8 and later.
This patch modifies the
Mailman/Archiver/Archiver.py file to alter the way messages are filed in list mbox archives to use daily mbox files instead of a single mbox file. This is discussed above under the Description heading.
The bin/arch script is modified to accommodate the changes in the list mbox archive files.
A new script, bin/split_old_mbox, is added which can be used to split existing mbox archives into a series of daily mbox files.
README.dailymbox is added to the Mailman build directory containing the information given beneath the Description heading above.
Apply the patch from within the Mailman build directory using the command:
patch -p1 < path-to-patch-file
Uses the same patch as MM 2.1.10
Uses the same patch as MM 2.1.8
Last updated: 09/07/2009 13:21