# $Id: TODO,v 1.11 2003/06/04 20:37:08 m-a Exp $

bogofilter TODO list

****

0.14: revise $(SHELL) macros. BSD make apparently uses the user's login
shell, for example /usr/local/bin/zsh, which can cause arbitrary test
failures. Make sure we have a Bourne/Korn-style shell.

****

New Feature: Token aging.

Support for struct data in the wordlists is already present.

****

Two deletes for kmail?

This wouldn't be a patch for bogofilter itself, but a change to give
kmail delete-as-spam and delete-as-nonspam buttons. Similarly for
other MUAs.

****

New Feature: Make it a milter?

****

New Feature: Multiple list file support with weights and rules.
Wordlist verification.

Eric Seppanen:
> Allow use of a variable number of list files, each with their
> own weights and rules.
> Possible uses:
> - hand-maintained "whitelist" or "blacklist" files, with massive
>   weighting to override everything else.
> - allow users to use system-wide list files and their own files.

Shared-database version based on the autodaemon code.

The shared-database version (which doesn't yet exist) would need
wordlist verification to avoid attacks on posters (thanks, Barry!).

Emulate the Vipul's Razor reputation scheme for people reporting
tokens? http://razor.sourceforge.net/

****

What this software is probably heading towards is a scheme in which
there's a general notion of tagged categories (spam being one), with
cluster analysis applied to determine which categories a message
belongs to at a confidence level above 0.9.

****

Design Enhancement: Cascade lexers.

Have one lexer to analyze the structure and a selection of lexers to
analyze the tokens. Use a functional approach to nest these lexers?

****

Design Enhancement: Improve evaluation accuracy by treating words
common in both wordlists as neutral, excluding them from the
probability calculation when their significance to the total
probability falls below a certain limit.

****

New Feature: Web based tool for wordlist management.

Allow message registration and whitelist management. Use HTML
templates for easy integration with existing web mail systems.

****

Performance Enhancement: Investigate if mmap/msync might help
performance. The db4 library does it already.

****

New Feature: Add support for a user-configurable list of headers to
ignore. Any header (single- or multi-line) named in the list should be
skipped during both the message registration and evaluation
procedures.
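
A rough sketch of how the ignored-header filtering might behave, written
in Python purely for illustration. This is not bogofilter code: the
function name strip_ignored_headers and the idea of passing the list as
a plain sequence of header names are assumptions, and the real feature
would presumably live in the lexer and read the list from the
configuration file. Continuation lines (those starting with a space or
tab) are treated as part of the header they follow.

```python
def strip_ignored_headers(message, ignored):
    """Return `message` with every header named in `ignored` removed.

    Hypothetical helper: header names are compared case-insensitively,
    and folded continuation lines are dropped with their header.
    """
    ignored_names = {name.lower() for name in ignored}

    # Headers end at the first blank line; the body is never touched.
    head, sep, body = message.partition("\n\n")

    kept = []
    skipping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Continuation line: it shares the fate of its header.
            if not skipping:
                kept.append(line)
            continue
        # New header line: decide whether this header is ignored.
        name = line.partition(":")[0].strip().lower()
        skipping = name in ignored_names
        if not skipping:
            kept.append(line)

    return "\n".join(kept) + sep + body


if __name__ == "__main__":
    sample = (
        "Received: from a.example\n"
        "\tby b.example\n"
        "Subject: hello\n"
        "\n"
        "body text\n"
    )
    print(strip_ignored_headers(sample, ["Received"]))
```

Running the same filter before both registration and evaluation would
keep the token counts in the wordlists consistent with what is scored.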