What:

  This document explains how to upgrade bogofilter's wordlist files
  from any earlier version to the current version.

  Note: if you have never used a version of bogofilter older than 0.8,
  you don't need to use bogoupgrade.

  There are two possible upgrade paths:

  1. Delete the current wordlist files and regenerate them from your
     email corpus. This is the recommended upgrade for the upcoming
     versions (0.75-beta and 0.75), since the definition of a 'word'
     has changed. Regeneration will pick up more words and will also
     create the wordlist files in the correct format.

  2. If you choose not to regenerate your wordlists, you MUST upgrade
     the format of the existing wordlist files.

Assumptions:

  A recent version of the bogofilter package is installed, and the
  programs /usr/bin/bogoutil and /usr/bin/bogoupgrade exist. Adjust
  paths to suit your taste and system.

How:

  1. Stop all instances of bogofilter. Although the upgrade tools lock
     the database files, the upgrade may take a long time on a busy
     site. Don't forget to stop cron jobs or daemons that fetch and
     process mail and could fire off bogofilter.

  2. Back up your data. Let's assume that you said:

       $ mv ~/.bogofilter ~/.bogofilter.safe
       $ mkdir ~/.bogofilter

  3. If your bogofilter version is less than 0.7, say

       $ /usr/bin/bogoupgrade -b /usr/bin/bogoutil -i ~/.bogofilter.safe/goodlist -o ~/.bogofilter/goodlist.db
       $ /usr/bin/bogoupgrade -b /usr/bin/bogoutil -i ~/.bogofilter.safe/badlist -o ~/.bogofilter/spamlist.db

     If your bogofilter version is 0.7 or greater, say

       $ /usr/bin/bogoupgrade -b /usr/bin/bogoutil -i ~/.bogofilter.safe/hamlist.count -o ~/.bogofilter/goodlist.db
       $ /usr/bin/bogoupgrade -b /usr/bin/bogoutil -i ~/.bogofilter.safe/spamlist.count -o ~/.bogofilter/spamlist.db

  4. Done. Restart any stopped daemons, cron tasks, etc.

Why:

  Versions 0.1? to 0.6 use a text file for message counts and data.
  The first line contains a signature and a message count.
  Subsequent lines contain space-separated word/count pairs, each
  followed by a newline. Here are the first few lines of a sample file:

    # bogofilter wordlist (format version A): 798
    word 5
    otherword 4
    yetanotherword 4

  Versions 0.7 to 0.7.4 use two files for each list: a text file for
  the message count, and a Berkeley DB file for the word/count values.
  Here is a sample signature and message count file:

    # bogofilter email-count (format version B): 1077

  Versions 0.7.5+ use a single Berkeley DB file to hold both word and
  message counts. A record with the special key '.MSG_COUNT' is used
  for the message count. The text file is unused.

  Note: some people may have applied a patch to version 0.70 that has
  a similar effect but uses a key of '.count' for the message count.
  This type will also be upgraded correctly.

Who:

  Gyepi Sam

  Any problems should first be addressed to the bogofilter lists, to
  which I am subscribed.
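
Example:

  If you are unsure which of the two command pairs in step 3 of How
  applies to your backup, the signature lines shown under Why can help.
  The following is only a minimal sketch, not part of bogofilter: the
  function name wordlist_format is hypothetical, and it assumes your
  files begin with exactly the signature text of the samples above.
  Anything else, including a Berkeley DB file, is reported as
  "unknown".

```shell
#!/bin/sh
# Hypothetical helper, not shipped with bogofilter: classify a file by
# the signature on its first line. The signature strings are copied
# from the sample files in the Why section; real files may differ.
wordlist_format() {
    # Read only the first line. NUL bytes are dropped so that binary
    # files (such as Berkeley DB files) do not upset the shell.
    first_line=$(head -n 1 "$1" 2>/dev/null | tr -d '\000')
    case "$first_line" in
        "# bogofilter wordlist (format version A):"*)
            echo "A" ;;        # text file: message count plus word/count pairs
        "# bogofilter email-count (format version B):"*)
            echo "B" ;;        # text file: message count only
        *)
            echo "unknown" ;;  # no recognised signature
    esac
}
```

  For instance, "wordlist_format ~/.bogofilter.safe/goodlist" prints A
  for a file like the first sample, which corresponds to the pre-0.7
  commands; B corresponds to the hamlist.count and spamlist.count
  files used by the 0.7 commands.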