The behavior of Aspell can be changed by any number of options, which
can be specified on the command line, in the environment variable
ASPELL_CONF, in a personal configuration file, or in a global
configuration file. Options specified on the command line override
options specified by the environment variable. Options specified by the
environment variable override options specified by either of the
configuration files. Finally, options specified by the personal
configuration file override options specified in the global
configuration file. Options specified in the environment variable
ASPELL_CONF, a personal configuration file, or a global configuration
file take effect no matter how Aspell is used, including when it is
used by other applications.
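The precedence rules above can be pictured as a simple layered merge. The following is only a sketch in Python, not Aspell's actual implementation: the function name resolve_options and the use of plain dictionaries are assumptions made for illustration, and list options (which are modified rather than overwritten) are not modeled.

```python
def resolve_options(global_conf, personal_conf, env_conf, command_line):
    """Merge option sources; later sources override earlier ones."""
    merged = {}
    # Lowest precedence first, highest precedence last.
    for source in (global_conf, personal_conf, env_conf, command_line):
        merged.update(source)
    return merged


effective = resolve_options(
    global_conf={"lang": "english", "mode": "url"},
    personal_conf={"lang": "german"},
    env_conf={"mode": "tex"},
    command_line={"mode": "none"},
)
print(effective)  # {'lang': 'german', 'mode': 'none'}
```

Here the personal file wins over the global file for lang, and the command line wins over both the environment variable and the global file for mode.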
Aspell has three basic types of options: boolean, value,
and list. Boolean options are either enabled or
disabled, value options take a specific value, and list
options can either have entries added or removed from the list.
To enable a boolean option simply specify the option without any
corresponding value. For example, to ignore accents when checking words
use ``--ignore-accents''. To disable a boolean option prefix the
option name with ``dont-''. For example, to not ignore accents
when checking words use ``--dont-ignore-accents''.
If a boolean option has a single letter abbreviation simply give the
letter corresponding to either enabling or disabling the option
without any corresponding value. For example, to consider run-together
words legal use ``-C'' or to consider them illegal use ``-B''.
To specify a value option simply specify the option with its corresponding
value. For example, to set the filter mode to TeX use ``--mode=tex''.
If a value option has a single letter shortcut simply specify the
single letter shortcut with its corresponding value. For example,
to use a large American dictionary use ``-d american-lrg''.
To add a value to a list option prefix the option name with ``add-''
and then specify the value to add. For example, to add the URL filter
use ``--add-filter url''. To remove a value from a list option
prefix the option name with ``rem-'' and then specify the value
to remove. For example, to remove the URL filter use ``--rem-filter
url''. To remove all items from a list prefix the option name with
``rem-all-'' without specifying any value. For example, to remove
all filters use ``--rem-all-filter''.
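The effect of the add-, rem-, and rem-all- prefixes on a list option can be sketched as follows. This is an illustrative Python model only; the function name apply_list_option is hypothetical, and the choice to ignore an attempt to add a value that is already present is an assumption, not documented Aspell behavior.

```python
def apply_list_option(current, action, value=None):
    """Return a new list after an add-, rem-, or rem-all- style change."""
    if action == "add":
        # Assumption: adding a value that is already present changes nothing.
        return list(current) if value in current else list(current) + [value]
    if action == "rem":
        return [item for item in current if item != value]
    if action == "rem-all":
        return []
    raise ValueError(f"unknown action: {action}")


filters = ["url"]
filters = apply_list_option(filters, "add", "email")   # like --add-filter email
filters = apply_list_option(filters, "rem", "url")     # like --rem-filter url
print(filters)  # ['email']
print(apply_list_option(filters, "rem-all"))           # like --rem-all-filter
```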
Aspell can also accept options via a personal or global configuration
file. The exact files to use are specified by the options per-conf
and conf respectively, but the personal configuration file
is normally ``.aspell.conf'' located in the HOME directory and
the global one is normally ``aspell.conf'', which is located in
the etc directory, which is normally ``/usr/etc'' or ``/usr/local/etc''.
To find out the particular values for your system use ``aspell
dump config''.
Each line of the configuration file has the format:
«option» [«value»]
There may be any number of spaces between the option and the value; however,
it can only be spaces, i.e. there is no '=' between the option name
and the value.
Comments may also be included by preceding them with a '#' as anything
from a ``#'' to a newline is ignored. Blank lines are also allowed.
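The line format described above is simple enough to parse by hand. The following Python sketch shows one way to read it; the function names are hypothetical, and the sketch assumes that a '#' never needs to appear inside a value (no escaping is handled).

```python
def parse_config_line(line):
    """Return (option, value) for one line, or None for blank/comment lines."""
    # Everything from '#' to the end of the line is a comment.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    # The option name ends at the first space; the value is optional.
    option, _, rest = line.partition(" ")
    value = rest.strip() or None
    return option, value


def parse_config(text):
    """Parse configuration text into a list of (option, value) pairs."""
    pairs = []
    for line in text.splitlines():
        parsed = parse_config_line(line)
        if parsed is not None:
            pairs.append(parsed)
    return pairs


sample = """# personal settings

lang german
run-together   true   # several spaces are fine
rem-all-filter
"""
print(parse_config(sample))
```

Keeping the result as an ordered list of pairs, rather than a dictionary, preserves the order of add- and rem- lines, which matters for list options.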
Values set in the personal configuration file override those in the
global file. Options specified either on the command line or via the
environment variable override those specified by either configuration
file.
To specify a boolean option simply include the option followed by
a ``true'' to enable it or a ``false'' to disable it. For
example to allow run-together words use ``run-together true''.
To specify a value option simply include the option followed by the
corresponding value. For example, to set the default language to German
use ``lang german''.
To add a value to a list option prefix the option name with ``add-''
and then specify the value to add. For example, to add the URL filter
use ``add-filter url''. To remove a value from a list option prefix
the option name with ``rem-'' and then specify the value to
remove. For example, to remove the URL filter use ``rem-filter url''.
To remove all items from a list prefix the option name with ``rem-all-''
without specifying any value. For example, to remove all filters use ``rem-all-filter''.
The environment variable ASPELL_CONF may also be used, and it overrides
any options set in the configuration files. The format of the string
is exactly the same as the configuration file except that semicolons
( ; ) are used instead of newlines.
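Because the ASPELL_CONF format differs from the file format only in its separator, splitting it takes just a few lines. This is a hedged Python sketch; parse_aspell_conf is a hypothetical name, and the example value is made up for illustration.

```python
def parse_aspell_conf(value):
    """Split an ASPELL_CONF-style string into (option, value) pairs."""
    pairs = []
    # Semicolons play the role that newlines play in a configuration file.
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        option, _, rest = entry.partition(" ")
        pairs.append((option, rest.strip() or None))
    return pairs


example = "lang german; add-filter url;rem-all-filter"
print(parse_aspell_conf(example))
# [('lang', 'german'), ('add-filter', 'url'), ('rem-all-filter', None)]
```

In a real program the string would come from the process environment, for example via os.environ.get("ASPELL_CONF", "").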
The following is a list of available options broken down by category.
Each entry has the following format:
«option»[,«single letter abbreviations»]
(«type») «description»
Single letter options are specified as they would appear on
the command line, i.e. with the preceding dash. Boolean single letter
options are specified in the following format:
-«abbreviation to enable»|-«abbreviation to disable»
«type» is one of the following:
boolean, string, file, dir, integer,
or list. String, file, dir, and
integer types are all value options which can only take a
specific type of value.
(dir) alternative location of language data
files. This directory is searched before data-dir. It defaults
to the same directory the actual main word list is in (which is not
necessarily dict-dir).
filter
(list) adds or removes a filter
home-dir
(dir) location for personal files
ignore,-W
(integer) ignore words <= n chars
ignore-case
(boolean) ignore case when checking words
ignore-accents
(boolean) ignore accents when checking words
ignore-repl
(boolean) ignore commands to store replacement
pairs
save-repl
(boolean) save the replacement word list on save
all
lang
(string) default language to use when creating a dictionary
or all else fails
language-tag
(string) language code to use when selecting
a dictionary. It follows the same format as the LANG environment
variable on most systems. In fact it defaults to the value of the
LANG environment variable if it is set.
mode
(string) sets the filter mode. Mode is one of none,
url, email, sgml, or tex. (The shortcut options '-e' may be used
for email, '-H' for HTML/SGML, or '-t' for TeX.)
per-conf
(file) personal configuration file
personal,-p
(file) personal word list file name
prefix
(dir) prefix directory
set-prefix
(boolean) set the prefix based on executable
location (only works on Win32 and when compiled with --enable-win32-relocatable)
repl
(file) replacements list file name
keyboard
(file) the base name of the keyboard definition
file to use (see section 5.4.4)
sug-mode
(mode) suggestion mode = ultra | fast | normal
| bad-spellers (see section 5.4.5)
The following options may be used to control which dictionaries to
use and how they behave (see section 5.4.1 for more information):
master,-d
(string) base name of the main dictionary to use.
The default Aspell installation provides the following dictionaries:
american, british, and canadian.
dict-dir
(dir) location of the main word list
extra-dicts
(list) extra dictionaries to use
strip-accents
(boolean) strip accents from all words in
the dictionary
To find out the current value of all the options use the command ``aspell
dump config''. This will dump the current configuration to standard
output. The format of the contents dumped is such that it can be used
as either the global or personal configuration file.
Aspell will go through the following steps to find an appropriate
dictionary:
1. If the master option is set in any fashion (via the command
line, the ASPELL_CONF environment variable, or a configuration
file) look for a dictionary of that name. If one cannot be found,
complain.
2. If the language-tag (not lang) option or the
LANG environment variable is set and the master option is not, then use
it (giving preference to language-tag over LANG) to search
for an appropriate dictionary. Aspell will use the same strategy that
Pspell does, which is based on the installed .pwli files, to find
an appropriate word list. For more information on how this is done
see the Pspell manual.
3. If step 2 fails, then look for a dictionary with the same name as the
current setting of the lang option. This will currently work even
if the language name is invalid, but this fact should not be relied
upon as it is an implementation detail.
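The search order above can be summarized in a short Python sketch. This is not Aspell's code: choose_dictionary, find_by_name, and find_by_tag are hypothetical names, the two lookup callables stand in for the real (Pspell-based) search, and falling through to the lang option when neither language-tag nor LANG is set is an assumption.

```python
def choose_dictionary(options, environ, find_by_name, find_by_tag):
    """Pick a dictionary following the documented search order.

    find_by_name and find_by_tag are caller-supplied lookups that return
    a dictionary or None; they are placeholders for the real search.
    """
    # Step 1: an explicit master option must resolve, otherwise complain.
    master = options.get("master")
    if master:
        found = find_by_name(master)
        if found is None:
            raise LookupError(f"dictionary not found: {master}")
        return found
    # Step 2: language-tag takes preference over the LANG variable.
    tag = options.get("language-tag") or environ.get("LANG")
    if tag:
        found = find_by_tag(tag)
        if found is not None:
            return found
    # Step 3: fall back to a dictionary named after the lang option.
    lang = options.get("lang")
    return find_by_name(lang) if lang else None


by_name = {"american": "american.dict", "german": "german.dict"}
by_tag = {"en_US": "american.dict"}
print(choose_dictionary({"language-tag": "en_US"}, {"LANG": "de_DE"},
                        by_name.get, by_tag.get))  # american.dict
```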
As with previous versions of Aspell you can specify the main dictionary
to use via the -d or --master option. However, as of Aspell .32 you
can now also:
Specify more than one word list to use with the extra-dicts option.
Optionally have all accents stripped from the word lists using the strip-accents
option. This is not the same thing as the ignore-accents
option: enabling ignore-accents would accept both
cafe and café (notice the accent on the e), but enabling only strip-accents
would accept only cafe, even if café is in the original dictionary.
Specifying strip-accents is just like using a word list without
the accents.
Specify special ``multi'' dictionaries.
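The difference between ignore-accents and strip-accents described in the list above can be illustrated with a small Python model. It is only a sketch: the function names are hypothetical, and Unicode decomposition is used here as a stand-in for whatever accent handling Aspell performs internally.

```python
import unicodedata


def strip_accents(word):
    """Remove combining accent marks from a word."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accepts(word, word_list, ignore_accents=False, strip_list_accents=False):
    """Model whether a word is accepted under the two accent options."""
    if ignore_accents:
        # ignore-accents: accents are disregarded on both sides.
        return strip_accents(word) in {strip_accents(w) for w in word_list}
    if strip_list_accents:
        # strip-accents: only the word list loses its accents.
        return word in {strip_accents(w) for w in word_list}
    return word in word_list


words = ["caf\u00e9"]  # the list contains only the accented spelling
print(accepts("cafe", words, ignore_accents=True))            # True
print(accepts("caf\u00e9", words, ignore_accents=True))       # True
print(accepts("cafe", words, strip_list_accents=True))        # True
print(accepts("caf\u00e9", words, strip_list_accents=True))   # False
```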
A ``multi'' dictionary is a special file which is basically a list
of dictionary files to use. A multi dictionary must end in .multi
and has roughly the same format as a configuration file where the
two valid keys are add and strip-accents. The add
key is used for adding individual word lists, or other ``multi''
files. The strip-accents key is used to control whether accents
are stripped from the dictionaries. Unlike the global strip-accents
option this option only affects word lists that come after the option.
For example:
strip-accents yes
add english
strip-accents no
add must-accent
will strip accents from the english word list but not the must-accent
word list. If the global strip-accents option is specified the local
strip-accents options are ignored.
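The position-dependent behavior of the local strip-accents key can be sketched in Python as follows. The function name parse_multi is hypothetical, nested ``multi'' files are not expanded, and the initial state (no stripping until a strip-accents line is seen) is an assumption.

```python
def parse_multi(text, global_strip_accents=None):
    """Return (word_list, strip_accents) pairs from .multi-style text.

    global_strip_accents is None when the global option is not specified;
    otherwise it overrides every local strip-accents line.
    """
    entries = []
    strip = False  # assumed state before any strip-accents line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        value = value.strip()
        if key == "strip-accents":
            # Applies only to the word lists added after this line.
            strip = value in ("yes", "true")
        elif key == "add":
            effective = strip if global_strip_accents is None else global_strip_accents
            entries.append((value, effective))
        else:
            raise ValueError(f"unexpected key: {key}")
    return entries


example = """strip-accents yes
add english
strip-accents no
add must-accent
"""
print(parse_multi(example))
# [('english', True), ('must-accent', False)]
```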
Aspell now provides multi dictionaries for three varieties of English:
american-med, british-med, and canadian-med.
The word lists themselves all contain accented words; however, the strip-accents
option is enabled by default for all the individual word lists. If
you wish to use the accented words you can set the global strip-accents
option to false or create a new multi word list.
Great care has been taken so that only one spelling for any particular
word is included in the main list. When two variants were considered
equal I randomly picked one for inclusion in the main word list. Unfortunately
this means that my choice in how to spell a word may not match your
choice. If this is the case you can try to include one of the special
variant dictionaries with the add-extra-dicts option. You
can choose from english-variant-0, english-variant-1, and english-variant-2.
Each of these word lists includes everything from the previous
variant level, thus there is no need to include more than one. English-variant-0
includes most variants which are considered almost equal, english-variant-1
includes variants which are also generally considered acceptable, and
english-variant-2 contains variants which are seldom used. These special
variant dictionaries are an experimental feature so please let me
know if you take advantage of them. If no one seems to be using them
I may no longer provide them in a future release of Aspell.
Many other dictionary sizes and varieties can be created. See the
scowl/ directory in the source distribution for information on the
different varieties you can create and section 4 for how
to create an individual dictionary.
5.4.2 Notes on Various Filters and Filter Modes
Aspell now has rudimentary filter support. You can either select
individual filters or choose a filter mode. To select a filter mode
use the mode option. You may choose from none, url,
email, sgml, and tex. The default mode
is url. Individual filters can be added with the add-filter option
and removed with the rem-filter option. The currently available
filters are url, email, sgml, tex
as well as a bunch of filters which translate the text from one format
to another.
The url filter/mode skips over URLs, host names, and email
addresses. Because this filter is almost always useful and rarely
does any harm it is enabled in all modes except none. To
turn it off either select the none mode or use the rem-filter
option after the desired mode is selected.
The email filter/mode skips over quoted text. It currently
does not support skipping over headers; however, a future version should.
In the meantime I suggest you use Aspell with Newsbody, which can
be found at http://home.worldonline.dk/~byrial/newsbody/. The
option email-skip controls the number of characters that
can appear before the email quote character; the default is 10. The option
add|rem-email-quote controls the characters that are considered
quote characters; the defaults are '>' and '|'.
5.4.2.4 SGML Filter/Mode
The sgml filter/mode will skip over sgml commands. It currently
does not handle nested < > unless they are in quotes. It also does
not handle the null end tag (net) minimization feature of sgml such
as
<emphasis/important/
The option add|rem-sgml-check controls which sgml tags should
always be checked. The default is ``alt''.
The option add|rem-sgml-extension controls which file extensions
are recognized as sgml/html files. The default is html, htm, php,
and sgml. The extensions are not case sensitive so extensions like
.HTM will also be recognized.
The sgml mode also enables a filter which will recognize sgml character
commands such as & and convert them into the proper iso8859-1 character.
Currently only the iso8859-1 character set is used; however, in future
versions it will convert them to the encoding that is specified in the
language data file. You can specifically turn on this filter by enabling
the SGML&«charset»/«charset»
filter.
The tex (all lowercase) filter/mode skips over TeX commands
and parameters and/or options to certain commands. It also skips over
TeX comments by default. The option [dont-]tex-check-comments
controls whether or not Aspell will skip over TeX comments. The
option add|rem-tex-command controls which TeX commands
should have certain parameters and/or options also skipped over. Commands
that are not specified will have all their parameters and/or options
checked. The format for each item is
«command» «a list of p, P, o, and Os»
The first item is simply the command name. The second item controls
which parameters to skip over. A 'p' skips over a parameter while
a 'P' won't. Similarly an 'o' will skip over an optional parameter while
an 'O' won't. The first letter on the list will apply to the first
parameter, the second letter will apply to the second parameter, etc.
If there are more parameters than letters Aspell will simply check
them as normal. For example the option
add-tex-command rule pp
will skip over the first two parameters of the ``rule'' command
while the option
add-tex-command foo Pop
will check the first parameter of the ``foo'' command,
skip over the next optional parameter, if it is present, and will
skip over the second parameter -- even if the optional parameter
is not present -- and will check any additional parameters.
A '*' at the end of the command is simply ignored. For example the
option
enlargethispage p
will ignore the first parameter in both enlargethispage and enlargethispage*.
To remove a command simply use the rem-tex-command option.
For example
rem-tex-command foo
will remove the command foo, if present, from the list of TEX commands.
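The way a letter list such as ``Pop'' is matched against the arguments of a command can be sketched in Python. This is one reading of the rules above, not Aspell's parser: plan_tex_arguments is a hypothetical name, arguments are pre-classified as 'param' or 'opt', and checking an optional argument that has no matching 'o' or 'O' letter is an assumption.

```python
def plan_tex_arguments(spec, arguments):
    """Return 'skip' or 'check' for each argument of one TeX command.

    spec      -- letters 'p', 'P', 'o', 'O' as given to add-tex-command
    arguments -- sequence of 'param' ({...}) or 'opt' ([...]) markers
    """
    if any(letter not in "pPoO" for letter in spec):
        raise ValueError(f"invalid letter list: {spec!r}")
    decisions = []
    pos = 0
    for kind in arguments:
        if kind == "opt":
            if pos < len(spec) and spec[pos] in "oO":
                decisions.append("skip" if spec[pos] == "o" else "check")
                pos += 1
            else:
                # Assumption: optionals without a matching letter are checked.
                decisions.append("check")
        elif kind == "param":
            # Pass over letters for optional parameters that are absent.
            while pos < len(spec) and spec[pos] in "oO":
                pos += 1
            if pos < len(spec):
                decisions.append("skip" if spec[pos] == "p" else "check")
                pos += 1
            else:
                # More parameters than letters: check as normal.
                decisions.append("check")
        else:
            raise ValueError(f"unknown argument kind: {kind!r}")
    return decisions


print(plan_tex_arguments("pp", ["param", "param"]))
# ['skip', 'skip']
print(plan_tex_arguments("Pop", ["param", "opt", "param", "param"]))
# ['check', 'skip', 'skip', 'check']
print(plan_tex_arguments("Pop", ["param", "param", "param"]))
# ['check', 'skip', 'check']
```

The last call shows the case from the text where the optional parameter is absent: the second mandatory parameter is still skipped.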
The prefix option is there to allow Aspell to easily be relocated.
Changing prefix will change all directory names relative
to the new prefix that are not explicitly set. For example if prefix
was ``/usr/local/aspell'' and dict-dir has a default
value of ``/usr/local/aspell/dict'' then changing prefix
to ``/opt/aspell'' will also change the default value of dict-dir
to ``/opt/aspell/dict''. Note that modifying prefix will only
affect the default compiled-in values of directories. If a directory
option is explicitly given a value then changing the value of prefix
has no effect on that directory option.
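The relocation rule can be sketched in Python as below. The function name relocate and the explicitly_set flag are hypothetical, and leaving directories outside the old prefix unchanged is an assumption.

```python
def relocate(directory, old_prefix, new_prefix, explicitly_set=False):
    """Re-root a compiled-in default directory under a new prefix."""
    if explicitly_set:
        # Explicitly configured directories ignore prefix changes.
        return directory
    old = old_prefix.rstrip("/")
    new = new_prefix.rstrip("/")
    if directory == old:
        return new
    if directory.startswith(old + "/"):
        return new + directory[len(old):]
    # Assumption: defaults outside the old prefix are left alone.
    return directory


print(relocate("/usr/local/aspell/dict", "/usr/local/aspell", "/opt/aspell"))
# /opt/aspell/dict
print(relocate("/srv/dicts", "/usr/local/aspell", "/opt/aspell",
               explicitly_set=True))
# /srv/dicts
```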
5.4.4 Notes on Typo-Analysis and the Keyboard Definition File
Aspell .33 and later will, in general, give a higher priority to
certain misspellings which are likely to be due to typos such as ``teh''
instead of ``the'' or ``hapoy'' instead of ``happy''.
However, in order to do this well Aspell needs to know the layout of
the keyboard. The keyboard definition file simply identifies keys
that are right next to each other. The file has an extension of .kbd
and each line consists of two letters corresponding to two keys that
are right next to each other. For example the line ``as'' will
indicate that 'a' and 's' are
right next to each other. If ``as'' is listed as an entry it is
not necessary to list ``sa'' as an entry as that will be done
automatically. Also by ``right next to each other'' I mean two
keys that are close enough together that it is easy to type one instead
of the other. On most keyboards this means keys that are to the left
or to the right of each other and not keys that are below or
above them.
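The automatic handling of the reverse pair can be sketched in Python. This is an illustration only: load_adjacency is a hypothetical name, and the sketch assumes the file contains nothing but two-letter lines and blank lines.

```python
def load_adjacency(text):
    """Build a symmetric set of adjacent key pairs from .kbd-style text."""
    adjacent = set()
    for raw in text.splitlines():
        pair = raw.strip()
        if not pair:
            continue
        if len(pair) != 2:
            raise ValueError(f"expected two letters, got {pair!r}")
        first, second = pair
        # Listing "as" implies "sa", so store both directions.
        adjacent.add((first, second))
        adjacent.add((second, first))
    return adjacent


keys = load_adjacency("as\nsd\n")
print(("s", "a") in keys)  # True, although only "as" was listed
print(("a", "d") in keys)  # False, 'a' and 'd' are not neighbors
```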
The default for this option is normally ``standard''. However,
the default can be changed via the language data file. The normal
default, ``standard'', should work well for most QWERTY-like keyboard
layouts. It may need minor adjusting for foreign keyboards and will
need to be completely rewritten for a Dvorak layout. When creating
a keyboard definition file for a foreign language please keep in mind
that Aspell completely ignores accents when scoring words so that
the keys 'o' and 'ö' will appear
to be the same key to Aspell even if they are in fact separate keys
on your keyboard.
5.4.5 Notes on the Different Suggestion Modes
In order to understand what these suggestion modes do, a basic understanding
of how Aspell works is required. See section 8 for that.
The suggestion modes are as follows.
ultra
This method will use the fastest method available to come up
with decent suggestions. This currently means that it will look for
soundslikes within an edit distance of one without doing any typo
analysis. It is slower than Ispell by a factor of 1.5 to 2 when a
single word list is used. Its speed is only slightly affected by the size
of the word list, if at all, but it is strongly affected by the number
of word lists used. In this mode Aspell gets about 87% of the words
from my small test kernel of misspelled words. (Go to http://aspell.sourceforge.net/test for more info on the test kernel as well as comparisons of this version
of Aspell with previous versions and other spell checkers.)
fast
This method is like ultra except that it also performs typo
analysis unless it is turned off by setting the keyboard to none.
The typo analysis brings words which are likely to be due to typos
to the beginning of the list but slows things down by a factor of
about two. This mode should get around the same number of words that
the ultra method does.
normal
This method looks for soundslikes within an edit distance of two
and performs typo-analysis unless it is turned off. It is around
10 times slower than fast mode with the English word list but returns
better suggestions. Its speed is directly proportional to the size
of the word list. This mode gets 93% of the words.
bad-spellers
This method also looks for soundslikes within an edit
distance of two but is more tailored for the bad speller, whereas
fast or normal are more tailored to strike a good balance between typos
and true misspellings. This mode never performs typo-analysis and
returns a huge number of words for the really bad spellers
who can't seem to get the spelling anything close to what it should
be. If the misspelled word looks anything like the correct spelling
it is bound to be found somewhere on the list of 100 or more
suggestions. This mode gets 98% of the words.