#LyX 1.1 created this file. For more info see http://www.lyx.org/
\lyxformat 218
\textclass book
\begin_preamble
\usepackage[T1]{fontenc}
\usepackage{xspace}
\newcommand{\nach}{$\to$\xspace}
\newcommand{\hoch}{\texttt{$^\wedge$}}
\usepackage{html}
\newcommand{\doubledash}{-\hspace{0.1em}-}
\newcommand{\doubledashb}{-\/-}
\newcommand{\dlt}{{\footnotesize$\ll$}}
\newcommand{\dgt}{{\footnotesize$\gg$}}
\begin{htmlonly}
\renewenvironment{lyxcode}
{\begin{list}{}{
\setlength{\rightmargin}{\leftmargin}
\raggedright
\setlength{\itemsep}{0pt}
\setlength{\parsep}{0pt}
\ttfamily}%
\item[]
\begin{ttfamily}}
{\end{ttfamily}
\end{list} }
\newenvironment{LyXParagraphIndent}[1]%
{\begin{quote}}
{\end{quote}}
\renewcommand{\LyX}{LyX}
\renewcommand{\doubledash}{\rawhtml --\endrawhtml}
\renewcommand{\doubledashb}{\rawhtml --\endrawhtml}
\renewcommand{\dlt}{«}
\renewcommand{\dgt}{»}
\renewcommand{\nach}{\rawhtml to \endrawhtml}
\renewcommand{\hoch}{\rawhtml ^\endrawhtml}
\end{htmlonly}
\end_preamble
\language english
\inputencoding latin1
\fontscheme default
\graphics default
\paperfontsize 11
\spacing single
\papersize letterpaper
\paperpackage a4
\use_geometry 1
\use_amsmath 0
\paperorientation portrait
\leftmargin 1in
\topmargin 1in
\rightmargin 1in
\bottommargin 1in
\secnumdepth 3
\tocdepth 2
\paragraph_separation skip
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
\layout Title
Aspell .33.7.1 alpha
\size larger
\newline
\series bold
\size large
A More Intelligent Ispell
\layout Author
Kevin Atkinson
\newline
kevina at users sourceforge net
\layout Standard
\begin_inset LatexCommand \tableofcontents{}
\end_inset
\layout Chapter
Introduction
\layout Standard
Aspell is an Open Source spell checker designed to eventually replace Ispell.
Its main feature is that it does a much better job of coming up with possible
suggestions than Ispell does.
In fact, recent tests show that it even does better than Microsoft Word
97's spell checker in some cases.
In addition, it has both compile-time and run-time support for non-English
languages.
Aspell is also a library; however, the recommended way to use aspell is through
the Pspell library, as the actual interface to the aspell library is constantly
changing.
\layout Section
The Future of Aspell
\begin_inset LatexCommand \label{future}
\end_inset
\layout Standard
Aspell .33.7 is most likely going to be the last official version of Aspell
as in the near future Aspell is going to be merged into Pspell:
\layout Standard
\series bold
\size large
From:
\series default
Kevin Atkinson
\newline
\series bold
To:
\series default
aspell-announce, pspell-announce
\newline
\series bold
Date:
\series default
08/01/2001
\newline
\series bold
Subject:
\series default
Aspell and Pspell will be merged.
\layout Standard
In the near future Aspell is going to be merged into Pspell.
This will happen with the next major release of Pspell.
If everything goes as planned, in the next major version of Pspell:
\layout Itemize
Aspell will be included as part of Pspell as a module
\layout Itemize
The Ispell module will also be included
\layout Itemize
Most of the functionality of the Aspell command will be replaced with a new
Pspell command which will work for any spell checker
\layout Itemize
The English dictionary will no longer be included with Aspell.
Instead it will be part of the Aspell-dicts package
\layout Itemize
The manuals will be switched over from LyX/LaTeX to LyX/DocBook so that
they can easily be converted to other formats such as info and man.
\layout Itemize
The current C++ interface may disappear and be replaced with a nicer C++
interface which will act as a wrapper for the C interface as I discussed
earlier.
If you are currently using the C++ interface and this will create a major
problem for you please let me know.
\layout Standard
This will require a major reorganization of just about anything associated
with Aspell and Pspell, which will probably require packagers to redo their
packaging scheme for Aspell and Pspell as some libraries and data files
will disappear, others will be created, and the location of some of the
data files will change.
Also, as mentioned above, Aspell as you know it now is going to disappear
and be replaced with a generic utility.
Some Aspell-specific things, such as creating dictionaries, will remain
in the Aspell utility; however, the executable will be renamed to aspell-util
to avoid any confusion.
\layout Standard
Also, the Aspell sourceforge project might disappear; however, I am not sure
about this.
At the very least the Aspell mailing lists and bug tracker will disappear to
avoid confusion over where questions and bug reports should be posted.
\layout Standard
All of this
\emph on
should
\emph default
happen around the end of August.
\layout Section
Comparison to other spell checker engines
\layout Standard
\added_space_top 0.3cm \added_space_bottom 0.3cm \align center
\begin_inset Tabular
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Aspell
\end_inset
|
\begin_inset Text
\layout Standard
Ispell
\end_inset
|
\begin_inset Text
\layout Standard
Netscape 4.0
\end_inset
|
\begin_inset Text
\layout Standard
Microsoft Word 97
\end_inset
|
\begin_inset Text
\layout Standard
Open Source
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Suggestion Intelligence
\end_inset
|
\begin_inset Text
\layout Standard
88-98
\end_inset
|
\begin_inset Text
\layout Standard
54
\end_inset
|
\begin_inset Text
\layout Standard
55-70?
\end_inset
|
\begin_inset Text
\layout Standard
71
\end_inset
|
\begin_inset Text
\layout Standard
Personal part of Suggestions
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Alternate Dictionaries
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
?
\end_inset
|
\begin_inset Text
\layout Standard
?
\end_inset
|
\begin_inset Text
\layout Standard
International Support
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
x
\end_inset
|
\begin_inset Text
\layout Standard
?
\end_inset
|
\begin_inset Text
\layout Standard
?
\end_inset
|
\end_inset
\layout Standard
The Suggestion Intelligence score is based on a small test kernel of misspelled/correct
word pairs.
Go to
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/test}
\end_inset
for more info and how you can help contribute to the test kernel.
The current scores for aspell are 88 in
\series bold
fast
\series default
mode, 93 in
\series bold
normal
\series default
mode, and 98 in
\series bold
bad spellers
\series default
mode; see section
\begin_inset LatexCommand \ref{suggestion}
\end_inset
for more information about the various suggestion modes.
\layout Standard
If you have any other information you would like to add to this chart please
contact me at kevina at users sourceforge net.
\layout Subsection
Comparison to Ispell
\layout Subsubsection
Features that only Aspell has
\layout Itemize
Does a much better job of coming up with suggestions than Ispell does,
or for that matter any other spell checker I have seen.
If you know a spell checker that does a better job please let me know.
\layout Itemize
Can learn from users' misspellings.
\layout Itemize
Is an actual library that other programs can link to instead of having
to use it through a pipe.
\layout Itemize
Is multiprocess intelligent.
When a personal dictionary (or replacement list) is saved it will now first
update the list against the dictionary on disk in case another process
modified it.
\layout Itemize
Can share the memory used in the main word list between processes.
\layout Itemize
Support for detachable dictionaries
\series bold
\series default
so that more than one aspell class can use the same dictionary.
\layout Itemize
Support for multiple personal dictionaries as well as support for special
auxiliary dictionaries.
\layout Itemize
Better support for run-together words.
\layout Itemize
Ability to use multiple dictionaries by simply specifying them on the command
line or in the configuration files.
\layout Subsubsection
Things that, currently, only Ispell has
\layout Itemize
Lower memory footprint
\layout Itemize
Support for affix compression
\layout Itemize
Perhaps better support for spell checking (La)TeX files.
\layout Itemize
Support for spell checking Nroff files.
\layout Chapter
Getting Started
\layout Section
Requirements
\begin_inset LatexCommand \label{reqs}
\end_inset
\layout Standard
Aspell requires gcc 2.95 (or better) as the C++ compiler.
Other C++ compilers should work with some effort.
Those for mostly POSIX compliant (Unix, Linux, BeOS, CygWin)
systems should work without any major problems provided that the compiler
can handle all of the advanced C++ features Aspell uses.
C++ compilers for non-Unix systems might work but it will take some work.
Aspell at the very least requires a Unix-like environment (sh, grep, sed, tr,
etc.) in order to build.
Aspell also uses a few POSIX functions when necessary.
Nevertheless, Aspell will compile and run using the MinGW version of gcc
provided that the CygWin environment is used to build it.
\layout Standard
\series bold
Aspell also requires the Portable Spell Checker Interface Library
\series default
, otherwise known as Pspell, to be installed on your system, in the same
location as Aspell will be installed in, before it will compile.
Aspell requires version .12.2 or better.
You can obtain the latest version of Pspell from
\begin_inset LatexCommand \url{http://pspell.sourceforge.net/}
\end_inset
\layout Section
Obtaining
\layout Standard
The latest version can always be found at Aspell's home page at
\begin_inset LatexCommand \url{http://aspell.sourceforge.net}
\end_inset
.
\layout Section
Support
\layout Standard
Support for Aspell can be found on the Aspell mailing lists.
Instructions for joining the various mailing lists (and an archive of them)
can be found off the Aspell home page at
\begin_inset LatexCommand \url{http://aspell.sourceforge.net}
\end_inset
.
Please use aspell-help for problems compiling and installing aspell, and
aspell-user for general questions.
\layout Section
Helping Out
\layout Standard
The easiest thing you can do to help out is to send me your
\series bold
.aspell.<>.prepl
\series default
file located in your home directory every so often.
(Email kevina at users sourceforge net.) The file contains data on the
word pairs for which aspell is unable to come up with the proper suggestion.
If the file does not exist it simply means that you were only using aspell
in a program such as emacs which does not communicate the replacement pairs
back to aspell.
Sending me the file would help me improve aspell's accuracy as right now
I don't have very much real data to work with.
\layout Standard
If you are a good programmer and really want to help me, consider doing one
of the tasks listed in a recent post:
\layout Standard
\series bold
\size large
From
\series default
: Kevin Atkinson
\newline
\series bold
Date
\series default
: 07/18/2001
\newline
\series bold
Subject
\series default
: Serious Help Needed for Aspell and Pspell
\layout Standard
In the past I have asked for help, but in specific areas and not very forcefully.
Well, now I
\emph on
really
\emph default
would appreciate some help with developing Aspell and Pspell.
I would like for Aspell to go to beta by the end of this summer but I really
don't see that happening unless I get some help.
\layout Standard
Some of the areas I could use assistance in:
\layout Itemize
Adding Affix compression support for Aspell
\layout Itemize
Adding gettext support to Aspell and Pspell
\layout Itemize
Making Aspell and Pspell thread safe
\layout Standard
And lots of other smaller areas.
I am looking for people who are competent programmers and have some experience
with C++.
I am also looking for people who have a good deal of shell programming,
Perl, and autotools experience to help me clean up my build system.
\layout Standard
Some areas such as thread safety will simply not happen until I get some
help because I really do not know enough about it.
\layout Standard
I hate to whine.
However, I have put an extremely large amount of unpaid time into Aspell
and would appreciate some giving back from those who use Aspell in their
distribution or have used the Pspell library as part of their applications,
especially from those with the resources to do so such as Open Source companies.
\layout Standard
I realize that my code, especially Aspell's, can be a bit scary with all
the complex C++; however, I will be glad to help anyone with this part
and for most of the tasks I need help with, one really doesn't even need
to know C++ that well.
Here is a breakdown of the tasks I would like help with and the skills
required.
\layout Standard
\series bold
\shape italic
Task
\series default
: Adding Affix compression support for Aspell
\layout Standard
\series bold
Skills
\series default
: Competent programmer with some C++ knowledge.
I know exactly what needs to be done in this area so it will require very
little knowledge of how Aspell works.
\layout Standard
\series bold
\shape italic
Task
\series default
: Adding gettext support to Aspell and Pspell
\layout Standard
\series bold
Skills
\series default
: Competent programmer with C++ knowledge and gettext knowledge.
I don't know what needs to be done in this area however they shouldn't
have to study Aspell that intensely as most of my text strings are concentrated
together.
They may also need to rework some of my strings into format strings (ie
convert cout << "Bla.." << name << "..bla.." to something like "Bla..
$1 ..bla..") so that they can be translated correctly.
However I have an idea how this should be done.
\layout Standard
\series bold
\shape italic
Task
\series default
: Making Aspell and Pspell thread safe.
\layout Standard
\series bold
Skills
\series default
: Competent programmer with good C++ and thread safety knowledge.
This is the area where the C++ skills will probably be needed the most.
Even though Aspell itself is not multi-threaded I would like it to be thread
safe so that it can be used by multi-threaded programs.
I have several areas that are potentially thread unsafe (such as accessing
a global pool) and need someone to give me advice on how to best do this.
I also have several classes which have the potential of being used by more
than one thread (such as the personal dictionary) and need to be made
thread safe.
\layout Standard
\series bold
\shape italic
Task
\series default
: Cleaning up the build system with Aspell and Pspell and similar tasks
\layout Standard
\series bold
Skills
\series default
: Competent shell programmer (preferably someone with more knowledge than
I have) and a good knowledge of how GNU autoconf, automake, and libtool
work.
\layout Standard
\series bold
\shape italic
Task
\series default
: Help with making the Aspell Dicts build system more robust
\layout Standard
\series bold
Skills
\series default
: Competent shell programmer (preferably someone with more knowledge than
I have) and a good enough knowledge of Perl to understand everything in
my "proc" script (
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/ aspell-gen-0.9.tar.bz2}
\end_inset
).
\layout Standard
\series bold
\shape italic
Task
\series default
: Miscellaneous other programming tasks as I (or someone else) think of
them
\layout Standard
\series bold
Skills
\series default
: Competent programmer with good C++ knowledge.
\layout Standard
Once Aspell is complete (ie gets into a beta state) you are really going
to appreciate it.
It will be able to do everything ispell can do and a
\emph on
lot
\emph default
more.
However, I need some help in getting there.
\layout Standard
Without some serious help it is highly likely that Aspell will not be
complete (ie reach a beta state) till next summer as I won't be able
to work on Aspell a great deal once school starts up again.
\layout Standard
Thanks in advance to anyone who can offer me some help.
\layout Standard
---
\newline
Kevin Atkinson
\newline
kevina at users sourceforge net
\newline
\begin_inset LatexCommand \url{http://www.ibiblio.org/kevina/}
\end_inset
\layout Section
Compiling & Installing
\layout Subsection
Generic Install Instructions
\layout Standard
Before Aspell is compiled Pspell must be installed.
You can obtain the latest version of Pspell from
\begin_inset LatexCommand \url{http://pspell.sourceforge.net/}
\end_inset
.
Both Pspell and Aspell must have the same prefix directory in order to
function correctly.
\layout Standard
Once Pspell is installed and you have read the sections below to take care
of any special requirements for your system, simply type
\layout Quote
./configure && make
\layout Standard
or
\layout Quote
./configure --disable-static && make
\layout Standard
to avoid making the static libraries on a system that supports shared libraries.
For additional configure options type ./configure --help.
You can control which C++ compiler is used by setting the environment
variable CXX before running configure, and you can control which flags are
passed to the C++ compiler via the environment variable CXXFLAGS.
\layout Standard
Aspell should then compile without any additional user intervention.
If you run into problems please first check the sections below as that
might solve your problem.
If it doesn't, please post a message to the aspell-help mailing list with
the compiler and system you are using and any error messages that were produced.
You can find more info on the aspell-help mailing list info page at
\begin_inset LatexCommand \url{http://lists.sourceforge.net/lists/listinfo/aspell-help}
\end_inset
.
\layout Standard
To install the program simply type
\layout Quote
make install
\layout Standard
And that's all there is to it for a basic installation.
\layout Standard
If you do not have Ispell or the traditional Unix
\begin_inset Quotes eld
\end_inset
spell
\begin_inset Quotes erd
\end_inset
utility installed on your system then you should also copy the compatibility
scripts
\begin_inset Quotes eld
\end_inset
ispell
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
spell
\begin_inset Quotes erd
\end_inset
located in the scripts/ directory into your binary directory, which is usually
/usr/local/bin, so that programs that expect the ispell or spell command will
work correctly.
\layout Subsection
General Problems
\layout Standard
Aspell requires a specific version of Pspell.
If the wrong version of Pspell is installed Aspell will not compile correctly.
So before you try anything else make sure you are using the correct version
of Pspell as stated in the requirements (
\begin_inset LatexCommand \ref{reqs}
\end_inset
) section.
\layout Standard
Aspell does not use a released version of GNU Libtool.
In previous versions of aspell this would often create problems if you inadvertently
modified a file in a way that caused Libtool to be called.
However, as of Aspell .33.6.1 this should no longer be a problem and automake,
autoconf, or libtool should not be called unless you specifically call
them or if you configure Aspell with --enable-maintainer-mode.
If you do notice any of these programs being called (and you did not configure
with --enable-maintainer-mode) please let me know about it.
If you have a need to modify configure.in or any of the Makefile.am's you
should install the multi-language-branch of the CVS version of libtool.
\layout Subsection
Curses Notes
\layout Standard
If you are having problems compiling termios.cc then the most likely cause
is an incompatibility with the curses implementation on your system.
If this is the case then you can explicitly disable the curses library
with --disable-curses.
By doing this you will lose the nice full screen interface but hopefully
you will be able to at least get Aspell to compile correctly.
\layout Standard
If the curses library is installed in a non-standard location then you can
specify the library and include directory with --enable-curses=<>
and --enable-curses-include=<>.
\series bold
Lib
\series default
can either be the complete path of the library (for example
\begin_inset Quotes eld
\end_inset
/usr/local/curses/libcurses.a
\begin_inset Quotes erd
\end_inset
), the name of the library (for example
\begin_inset Quotes eld
\end_inset
ncurses
\begin_inset Quotes erd
\end_inset
) or a combined location and library in the form
\begin_inset Quotes eld
\end_inset
-L<> -l<>
\begin_inset Quotes erd
\end_inset
(for example
\begin_inset Quotes eld
\end_inset
-L/usr/local/ncurses/lib -lncurses
\begin_inset Quotes erd
\end_inset
).
\series bold
Dir
\series default
is the location of the curses header files (for example
\begin_inset Quotes eld
\end_inset
/usr/local/ncurses/include
\begin_inset Quotes erd
\end_inset
).
\layout Subsection
Win32 Notes
\layout Standard
Aspell is now able to compile on Win32 platforms using the Win32 version
of gcc.
Aspell .30.1 can either be compiled with the Cygwin or the Mingw version
of Gcc 2.95 using the Cygwin development environment.
The Mingw version of Aspell will have slightly less functionality, but
none of which is noticeable to the end user.
In order to get the nice full screen interface with Mingw when spell checking
files a curses implementation that does not require Cygwin is required.
The PDCurses (
\begin_inset LatexCommand \url{http://www.lightlink.com/hessling/PDCurses/}
\end_inset
) implementation is known to work; other implementations may work, however
they have not been tested.
See the previous section for information on specifying the location of
the curses library and include file.
\layout Standard
When compiling Pspell I recommend you configure with --disable-shared and
--disable-ltdl.
Shared libraries won't work correctly anyway on Win32 and trying to compile
the ltdl library can lead to unnecessary complications.
When compiling Aspell I recommend you configure with --disable-shared.
If you are planning to use Aspell outside of the Cygwin environment I strongly
recommend you install Aspell in its own location (ie prefix is not /usr/local/)
and compile it with --enable-win32-relocatable.
Please note that Pspell and Aspell must be installed in the same location.
(ie don't install Pspell in c:/pspell and Aspell in c:/aspell.
Instead install them both in c:/aspell)
\layout Standard
If Aspell is compiled with --enable-win32-relocatable and the
\series bold
bindir
\series default
is set to the same value as
\series bold
prefix
\series default
(ie not <>/bin) then the Aspell directory (what prefix is set to)
can be relocated anywhere provided that none of the data files are moved
around within the Aspell directory.
\layout Standard
The default paths for Aspell are designed for a Unix system and not a Win32
system so you might want to specify different ones when compiling Aspell.
Also if the HOME environment variable is not set Aspell will assume it
is the current working directory.
This may lead to your personal word lists being saved in unpredictable
locations.
To solve this either compile with --enable-win32-relocatable (see above)
or specify the complete path of the personal and replacement word lists
in aspell.conf.
If Aspell is compiled with --enable-win32-relocatable then the personal
word lists are saved in the
\series bold
prefix
\series default
directory and the name is changed from
\begin_inset Quotes eld
\end_inset
\family typewriter
.aspell..*
\family default
\begin_inset Quotes erd
\end_inset
to
\begin_inset Quotes eld
\end_inset
\family typewriter
.*
\family default
\begin_inset Quotes erd
\end_inset
.
\layout Subsection
Egcs 1.1 Notes
\layout Standard
Aspell should now be able to work with Egcs 1.1 but I have not been able to
actually test it.
If you have any luck one way or the other please let me know.
\layout Subsection
Upgrading from version .33
\layout Standard
Even though .33.5 is a minor release it will unfortunately break binary compatibility
with Aspell .33 due to the extensive changes needed to make Aspell more
C++ compliant.
This means applications such as Gaspell will need to be recompiled.
\layout Subsection
Upgrading from version .32.6
\layout Standard
I have expanded the medium (*-med) word lists and decided to eliminate the
large word lists (*-lrg) for now.
However, the install process will not automatically remove the large
word lists so if you don't want them hanging around you should delete them
yourself.
To remove them all, delete the following files from <>/aspell:
\layout Quote
\align left
american-lrg-only british-lrg-only canadian-lrg-only english-lrg-only american-lrg.multi
british-lrg.multi canadian-lrg.multi
\layout Standard
and the following files from <>/pspell:
\layout Quote
\align left
en-american-lrg-aspell.pwli en-canadian-lrg-aspell.pwli en-british-lrg-aspell.pwli
\layout Subsection
Upgrading from version .32.1
\layout Standard
Even though .32.5 is a minor release it breaks binary compatibility, which means
applications such as Gaspell will need to be recompiled.
\layout Subsection
Upgrading from version .31.1
\layout Standard
The format and name of the main dictionary has changed yet again.
The install process will overwrite the old version, so unless you are
using dictionaries other than the one provided with aspell or want to have
multiple versions of aspell installed you should not have to worry about
this.
\layout Standard
The apostrophe (') is no longer considered part of the word if it appears
at the end of a word.
This means that you may have to manually remove words from your personal
word list if you get a message similar to:
\layout Quote
Invalid word "dogs'": The character ''' may not appear at the end of a word.
\layout Standard
To remove the word simply delete the line containing the word from the personal
word list (normally called
\begin_inset Quotes eld
\end_inset
.aspell.english.pws
\begin_inset Quotes erd
\end_inset
).
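\layout Standard
As a rough illustration of the cleanup described above, the following Python sketch lists the entries of a word list that end in an apostrophe.
It is not part of Aspell: the function name is hypothetical, and it assumes only that the list holds one entry per line, without relying on any other detail of the personal word list format.

```python
def words_ending_in_apostrophe(lines):
    """Return the stripped entries that end with an apostrophe.

    Hypothetical helper for illustration; it treats every line as a
    candidate entry and knows nothing about Aspell's file format.
    """
    flagged = []
    for line in lines:
        word = line.strip()
        if word.endswith("'"):
            flagged.append(word)
    return flagged
```

\layout Standard
Each entry it reports would then be deleted by hand from the personal word list.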
\layout Standard
Aspell now uses a completely new word list.
This means that some words that were in the original word list may no longer
appear in the current one.
You may now also choose from American, British, and Canadian spelling and
from two sizes, medium and large.
See section
\begin_inset LatexCommand \ref{dict-opts}
\end_inset
for more information on choosing among the different options.
The original source that the word lists were created from is now found
under the scowl/ directory.
\layout Subsection
Upgrading from version .30
\layout Standard
The format of the main dictionary file has changed a bit.
If you were able to use Aspell .30 then the old format should work.
The only time the old format will NOT work is in the rare case the mmap
fails.
Previous versions of aspell would just abort with an error when the mmap
failed but the new version will attempt to read in the file using fread.
Fread will fail with the old version of the main word list.
\layout Subsection
Upgrading from version .29.1
\layout Standard
The format (but not the name) of the main dictionary has changed yet again.
The install process will overwrite the old version, so unless you are
using dictionaries other than the one provided with aspell or want to have
multiple versions of aspell installed you should not have to worry about
this.
\layout Standard
Aspell also now depends on the Portable Spell Checker Interface Library, otherwise
known as Pspell.
Pspell must be installed before aspell will compile; you can find it at
\begin_inset LatexCommand \url{http://pspell.sourceforge.net/}
\end_inset
.
\layout Subsection
Upgrading from version .29
\layout Standard
The format and name of the main dictionary has changed which means it will
need to be recompiled.
The install process will remove the old files for you, so unless you are
using dictionaries other than the one provided with aspell or want to have
multiple versions of aspell installed you should not have to worry about
this.
\layout Subsection
Upgrading from version .28.3
\layout Standard
Aspell now uses namespaces which means egcs 1.0 and gcc 2.8 will no longer
cut it.
If this becomes a serious problem let me know as it should not be too difficult
to get it working again with egcs 1.0 and gcc 2.8.
\layout Standard
Due to the new soundslike code the main dictionary will need to be recompiled.
The build process does this automatically so unless you want to have more
than one version of aspell around you should not need to worry about this.
\layout Standard
The format and file name of the personal dictionaries has also changed.
In most cases aspell will automatically detect this and convert it for
you by using the following algorithm.
\layout Enumerate
If no *.pws file (for the personal word list) or *.prepl file (for the
personal replacement list) exists, aspell will look for *.per or *.rpl respectively.
If that file is found it will read in the data using the old format.
\layout Enumerate
When saving the dictionary it will save it as *.pws or *.prepl respectively.
\layout Enumerate
Once saved in the new format it will delete the old file.
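\layout Standard
The three steps above can be sketched in Python.
This is only an illustration of the documented lookup order, not Aspell's actual code: the function names, the file stem passed in by the caller, and the 
\series bold
converted
\series default
 flag are all assumptions made for the example, and the actual reading and writing of dictionary data is reduced to plain text.

```python
import os

# Old-format extension for each new-format extension, as named in the steps above.
NEW_TO_OLD = {".pws": ".per", ".prepl": ".rpl"}


def find_personal_file(directory, stem, new_ext):
    """Return (path, fmt) for the file to read, or (None, None) if none exists.

    Hypothetical helper: prefers the new name and falls back to the old one.
    """
    new_path = os.path.join(directory, stem + new_ext)
    if os.path.exists(new_path):
        return new_path, "new"
    old_path = os.path.join(directory, stem + NEW_TO_OLD[new_ext])
    if os.path.exists(old_path):
        return old_path, "old"
    return None, None


def save_personal_file(directory, stem, new_ext, data, converted):
    """Always write under the new name; delete the old file only after a conversion.

    'converted' should be True when the data was read from the old-format file.
    """
    new_path = os.path.join(directory, stem + new_ext)
    with open(new_path, "w") as out:
        out.write(data)
    if converted:
        old_path = os.path.join(directory, stem + NEW_TO_OLD[new_ext])
        if os.path.exists(old_path):
            os.remove(old_path)
    return new_path
```

\layout Standard
In this sketch the old file is deleted only when the data was originally read from it, which is one reading of the steps above and matches the statement that old files are left alone once the new ones exist.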
\layout Standard
If you have an older version of aspell around you can restore the old dictionaries
by using these commands
\layout Quote
aspell.new dump personal | aspell.old create personal
\layout Quote
aspell.new dump repl | aspell.old create repl
\layout Standard
The new version of aspell will then leave the old files alone as long as
*.pws and *.prepl exist.
\layout Standard
Also, if the file does not end in .pws or .prepl it will try to read it in
using the new format and if that fails it will read in the old format.
When the file is saved it will be saved in the new format.
\layout Standard
I am hoping I will not have to change the format of the personal dictionaries
again.
The format of the main word list, however, is very likely to change.
\layout Subsection
Upgrading from version .28.2.1
\layout Standard
The behavior of
\begin_inset Quotes eld
\end_inset
aspell check
\begin_inset Quotes erd
\end_inset
changed so that it will now overwrite the original file, as creating a new
file was creating too many problems when used with programs like pine and
vi.
\layout Subsection
Upgrading from version .27.2
\layout Standard
The names of the personal word lists have changed from .aspell.per and .aspell.rpl
to .aspell.<>.per and .aspell.<>.rpl respectively.
<> is the language name which will generally be
\begin_inset Quotes eld
\end_inset
english
\begin_inset Quotes erd
\end_inset
.
If you wish to use your old word lists you will need to rename those files.
\layout Subsection
Upgrading from version .25
\layout Standard
The format of the personal replacement dictionary has changed.
So, you will either need to rename or remove the file
\series bold
.aspell.rpl
\series default
located in your home directory.
If you have information in this file you would like to preserve please
send me an email.
\layout Subsection
Upgrading from version .24
\layout Standard
Because the location of the main word list moved you should probably do
a
\series bold
make uninstall
\series default
(with the old version of Aspell) before upgrading to remove the old word
lists.
A make uninstall will not remove any personal word lists.
\layout Chapter
Basic Usage
\layout Standard
For a quick reference on the Aspell utility use the command
\begin_inset Quotes eld
\end_inset
aspell --help
\begin_inset Quotes erd
\end_inset
.
\layout Section
Spellchecking Individual Files
\begin_inset LatexCommand \label{check}
\end_inset
\layout Standard
To spellcheck a file with Aspell, type
\layout Quote
aspell check [<options>] <filename>
\layout Standard
at the command line, where <filename> is the file you want to check and
 <options> is any number of options.
Some of the more useful ones include:
\layout Description
--mode=<mode> the mode to use when checking files.
The available modes are none, url, email, sgml, or tex.
See section
\begin_inset LatexCommand \ref{filter}
\end_inset
 for more information on the various modes.
\layout Description
--dont-backup don't create a backup file.
\layout Description
--sug-mode=<mode> the suggestion mode to use where mode is one of ultra,
fast, normal, or bad-spellers.
See section
\begin_inset LatexCommand \ref{suggestion}
\end_inset
for more information on these modes.
\layout Description
--master=<name> the main dictionary to use.
 The default Aspell installation provides the following dictionaries: american,
british, and canadian.
\layout Standard
Please see Chapter
\begin_inset LatexCommand \ref{customizing}
\end_inset
for more information on the available options.
\layout Standard
For example to check the file foo.txt:
\layout Quote
aspell check foo.txt
\layout Standard
and to check the file foo.txt using the bad-spellers suggestion mode and
the large American English dictionary:
\layout Quote
aspell check --sug-mode=bad-spellers --master=american-lrg foo.txt
\layout Standard
If the
\series bold
mode
\series default
 option is not given then aspell will use the extension of the file to determine
 the current mode.
 If the extension is .tex, then TeX mode will be used; if the extension is
 .html, .htm, .php, or .sgml it will check the file in sgml mode; otherwise
 it will use url mode.
Please note that the
\series bold
sgml-options
\series default
 can be used to change which extensions are used for the sgml mode.
See chapter
\begin_inset LatexCommand \ref{filter}
\end_inset
for more information on the various modes that can be used.
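\layout Standard
As a small illustration, the automatic choice can be overridden by giving
 the mode explicitly.
 The file name notes.txt below is only a placeholder; its extension would
 otherwise select url mode, but here sgml mode is forced:
\layout Quote
aspell check --mode=sgml notes.txt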
\layout Standard
If Aspell was compiled with curses support and the TERM environmental variable
 is set to a capable terminal type then Aspell will use a nice full screen
interface.
Otherwise it will use a simpler
\begin_inset Quotes eld
\end_inset
dumb
\begin_inset Quotes erd
\end_inset
terminal interface where the misspelled word is surrounded by two '*'.
 In either case the interface should be self-explanatory.
\layout Section
Using Aspell with other Applications
\layout Subsection
With Applications that Expect Ispell
\layout Standard
Aspell can be used as a drop-in replacement for Ispell for programs that
 use Ispell through a pipe such as Emacs and LyX.
 It can also be used with programs that simply call the ispell command
 and expect the original file to be overwritten with the corrected version.
 It supports the basic features of Ispell; however, it does not currently
 have an Nroff mode so there may be situations in which you still wish to
 use Ispell.
 Nevertheless, I have been using Aspell for Xemacs and LyX since the middle
 of September of 1998 without any problems.
\layout Standard
If you do not have Ispell installed on your system and have installed the
 Ispell compatibility script then you should not need to do anything, as most
 applications that expect Ispell will work as expected with Aspell
 via the Ispell compatibility script.
\layout Standard
Otherwise, the recommended way to use Aspell as a replacement for ispell
is to change the Ispell command from within the program being used.
 If the program uses ispell in pipe mode simply change ispell to aspell.
If the program calls the
\series bold
\series default
ispell command to check the file change
\begin_inset Quotes eld
\end_inset
ispell
\begin_inset Quotes erd
\end_inset
\series bold
\series default
 to 
\begin_inset Quotes eld
\end_inset
aspell check
\begin_inset Quotes erd
\end_inset
.
\layout Standard
If that is impossible and the program uses ispell through a pipe, then the
 run-with-aspell script can be used.
 The format of the script is:
\layout Quote
run-with-aspell <command>
\layout Standard
where <command> is the name of the program with any optional arguments.
\layout Standard
The old method of mapping Ispell to Aspell is discouraged because it can
create compatibility problems with programs that actually require Ispell
such as Ispell's own scripts.
\layout Subsection
With Emacs and Xemacs
\layout Standard
The easiest way to use Aspell with Emacs or Xemacs is to add this line:
\layout LyX-Code
(setq-default ispell-program-name "aspell")
\layout Standard
to the end of your .emacs file.
\layout Standard
For some reason version 3.0 of ispell.el (the lisp program that (x)emacs uses)
 wants to reverse the suggestion list.
To fix this add this line:
\layout LyX-Code
(setq-default ispell-extra-args '("--reverse"))
\layout Standard
after the previous line in your .emacs file and it should solve the problem.
\layout Standard
Ispell.el, version 3.1 (December 1, 1998) and later, has the list reversing
problem fixed.
You can find it at
\begin_inset LatexCommand \url{http://www.kdstevens.com/~stevens/ispell-page.html}
\end_inset
.
\layout Subsection
With LyX
\layout Standard
Version 1.0 of LyX provides support for Aspell's feature of learning from
 the user's mistakes.
\layout Standard
To use aspell with LyX 1.0 either change the
\series bold
spell_command
\series default
option in the lyxrc file or use the run-with-aspell utility.
\layout Subsection
With VIM
\layout Standard
\shape italic
(The following section was written by "R.
Marc", rmarc at copacetic net.)
\layout Standard
To use aspell in vim you simply need to add the following line to your .vimrc
file:
\layout LyX-Code
map ^T :w!<CR>:!aspell check %<CR>:e! %<CR>
\layout Standard
I use <Ctrl-T> since that's the way you spell check in pico.
 In order to add a control character to your .vimrc you must type <Ctrl-V>
 first.
 In this case <Ctrl-V><Ctrl-T>.
\layout Standard
A more useful way to use Aspell, IMHO, is in combination with newsbody (
\begin_inset LatexCommand \url{http://www.image.dk/~byrial/newsbody/}
\end_inset
) which is how I use it since vim is my editor for my mailer and my news
reader.
\layout LyX-Code
map ^T 
\backslash 
1
\backslash 
2:e! %<CR>
\layout LyX-Code
map 
\backslash 
1 :w!<CR>
\layout LyX-Code
map 
\backslash 
2 :!newsbody -qs -n % -p aspell check 
\backslash 
%f<CR>
\layout Subsection
With Pine
\layout Standard
To use aspell in pine simply change the option
\series bold
speller
\series default
to
\layout Quote
aspell --mode=email check
\layout Standard
To change the speller option go to the main menu.
Type
\series bold
S
\series default
for
\emph on
setup
\emph default
,
\series bold
C
\series default
for
\emph on
config
\emph default
, then W for
\emph on
where is
\emph default
.
Type in
\series bold
speller
\series default
as the word to find.
The speller option should be highlighted now.
Hit enter, type in the above line, and hit enter again.
Then type
\series bold
E
\series default
for
\emph on
exit setup
\emph default
and
\series bold
Y
\series default
to save the change.
\layout Standard
If you have a strong desire to check other people's comments change 
\series bold
speller
\series default
to
\layout Quote
aspell check
\layout Standard
instead which will avoid switching aspell into email mode.
\layout Chapter
Managing Word Lists
\begin_inset LatexCommand \label{manage}
\end_inset
\layout Section
Creating an Individual Word List
\layout Standard
To create an individual main word list from a list of words use the command
\layout Quote
aspell --lang=<language> create master ./<name> < <wordlist>
\layout Standard
where <name> is the name of the word list and <wordlist> is the list
 of words separated by white space.
The name of the word list will automatically be converted to all lowercase.
The
\begin_inset Quotes eld
\end_inset
./
\begin_inset Quotes erd
\end_inset
is important because without it aspell will create the word list in the
normal word list directory.
 If you are trying to create a word list in a language other than English
check the aspell data-dir (usually /usr/share/aspell, use
\begin_inset Quotes eld
\end_inset
aspell dump config
\begin_inset Quotes erd
\end_inset
to find out what it is on your system) to see if a language data file exists
for your language.
If not you will need to create one.
See chapter
\begin_inset LatexCommand \ref{inter}
\end_inset
for more information on using Aspell with other languages.
\layout Standard
This will create the file <name> in the current directory.
To use the new word list copy the file to the normal word list directory
(use
\begin_inset Quotes eld
\end_inset
aspell config
\begin_inset Quotes erd
\end_inset
 to find out what it is) and use the option --master=<name>.
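\layout Standard
As a sketch of the whole procedure, assume the language name is english
 and that a plain text file of words exists; the names mywords and mywords.txt
 are made up for this illustration:
\layout Quote
aspell --lang=english create master ./mywords < mywords.txt
\layout Standard
Once the resulting file has been copied to the word list directory it could
 then be selected like this, where foo.txt is again only a placeholder:
\layout Quote
aspell check --master=mywords foo.txt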
\layout Standard
The compiled dictionary file is machine dependent.
 It depends on the byte order and the page size of the machine because
 the file is mmapped in.
 Please do not distribute the compiled dictionaries unless you are only
 distributing them for a particular platform such as you would a binary.
 That is why it is normally installed in 
\begin_inset Quotes eld
\end_inset
lib/aspell
\begin_inset Quotes erd
\end_inset
instead of
\begin_inset Quotes eld
\end_inset
share/aspell
\begin_inset Quotes erd
\end_inset
.
\layout Standard
Aspell is now also able to use special
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
dictionaries.
See section
\begin_inset LatexCommand \ref{dict-opts}
\end_inset
 for more information.
\layout Standard
A personal and replacement word list can be created in a similar fashion.
\layout Standard
Because Aspell does not support any sort of affix compression like Ispell
 does, Ispell word lists will not work as is.
 In order to use Ispell's word lists simply pipe the word list through ``ispell
 -e'' to expand the munched word lists.
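\layout Standard
One way this might look in practice is shown below.
 It is only a sketch: the file names munched.txt and expanded are invented,
 the language name english is assumed, and Ispell must be able to find the
 affix table that matches the word list:
\layout Quote
ispell -e < munched.txt | aspell --lang=english create master ./expanded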
\layout Subsection
Format of the Replacement Word List
\layout Standard
The replacement word list has each replacement pair on its own line in the following
format
\layout LyX-Code
<misspelled word>: <correction>
\layout LyX-Code
\layout Section
The PWLI file
\layout Standard
In order for Aspell to be able to correctly recognize a dictionary based
on the setting of the LANG environmental variable and for Pspell to be
able to find your word lists each main word list installed should have
at least one PWLI file associated with it in the Pspell data directory.
This is normally /usr/local/share/pspell/.
You can use
\begin_inset Quotes eld
\end_inset
pspell-config pkgdatadir
\begin_inset Quotes erd
\end_inset
to find out what it is on your system.
\layout Standard
Each PWLI has the following name:
\layout Quote
<language>[-[<spelling>][-<jargon>]]-<module>.pwli
\layout Standard
Where <language> is the two letter language code, <spelling> is the
 particular spelling you're interested in if the language has multiple spellings
 in different parts of the world such as English, <jargon> is any extra
 information to distinguish the word list from other ones with the same
 language and spelling, and <module> is the pspell module the main word
 list is for.
\layout Standard
For example:
\layout Quote
en-aspell.pwli
\newline
en-american-aspell.pwli
\newline
en-american-medical-ispell.pwli
\newline
en-american-xlg-ispell.pwli
\newline
de--medical-ispell.pwli
\layout Standard
Notice that if the spelling is left out but the jargon is not, there need
 to be two dashes between the language and the jargon.
\layout Standard
Each PWLI file for an Aspell word list should then contain exactly one line
which contains the full path of the main word list.
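\layout Standard
For instance, a file named en-american-aspell.pwli might consist of nothing
 more than the following line.
 The path is purely hypothetical and depends on where the word lists were
 installed on your system:
\layout LyX-Code
/usr/local/lib/aspell/american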
\layout Section
Dumping the contents of the word list
\layout Standard
The dump command will simply dump the contents of a word list to stdout
 in a format that can be read back in with 
\series bold
aspell create
\series default
.
\layout Standard
If no word list is specified the command will act on the default one.
For example the command
\layout Quote
aspell dump personal
\layout Standard
will simply dump the contents of the current personal word list to stdout.
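\layout Standard
Since the output goes to stdout it can be redirected with the usual shell
 syntax, for example to keep a plain text copy of the personal word list.
 The file name personal-words.txt is only an illustration:
\layout Quote
aspell dump personal > personal-words.txt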
\layout Chapter
Customizing Aspell
\begin_inset LatexCommand \label{customizing}
\end_inset
\layout Standard
The behavior of Aspell can be changed by any number of options which can
be specified at either the command line, the environmental variable ASPELL_CONF
, a personal configuration file, or a global configuration file.
Options specified on the command line override options specified by the
environmental variable.
Options specified by the environmental variable override options specified
 by either of the configuration files.
 Finally, options specified by the personal configuration file override options
specified in the global configuration file.
Options specified in the environmental variable ASPELL_CONF, a personal
configuration file, or a global configuration file will take effect no
matter how Aspell is used which includes being used by other applications.
\layout Standard
Aspell has three basic types of options: 
\series bold
boolean
\series default
,
\series bold
value
\series default
, and
\series bold
list
\series default
.
\series bold
Boolean
\series default
options are either enabled or disabled,
\series bold
value
\series default
options take a specific value, and
\series bold
list
\series default
options can either have entries added or removed from the list.
\layout Section
Specifying Options
\layout Subsection
At the Command Line
\layout Standard
All options specified at the command line have the following basic format:
\layout Quote
--<option>[=<value>]
\layout Standard
where the '=' can be replaced by whitespace.
\layout Standard
However some options also have single letter abbreviations of the form:
\layout Quote
-<letter>[<optional whitespace><value>]
\layout Subsubsection
Boolean
\layout Standard
To enable a boolean option simply specify the option without any corresponding
value.
For example to ignore accents when checking words use
\begin_inset Quotes eld
\end_inset
--ignore-accents
\begin_inset Quotes erd
\end_inset
.
To disable a boolean option prefix the option name with a
\begin_inset Quotes eld
\end_inset
dont-
\begin_inset Quotes erd
\end_inset
.
For example to not ignore accents when checking words use
\begin_inset Quotes eld
\end_inset
--dont-ignore-accents
\begin_inset Quotes erd
\end_inset
.
\layout Standard
If a boolean option has a single letter abbreviation simply give the letter
 corresponding to either enabling or disabling the option without any corresponding
 value.
For example to consider run-together words legal use
\begin_inset Quotes eld
\end_inset
-C
\begin_inset Quotes erd
\end_inset
or to consider them illegal use
\begin_inset Quotes eld
\end_inset
-B
\begin_inset Quotes erd
\end_inset
\layout Subsubsection
Value
\layout Standard
To specify a value option simply specify the option with its corresponding
value.
 For example to set the filter mode to TeX use 
\begin_inset Quotes eld
\end_inset
--mode=tex
\begin_inset Quotes erd
\end_inset
.
\layout Standard
If a value option has a single letter shortcut simply specify the single
 letter shortcut with its corresponding value.
 For example to use a large American dictionary use 
\begin_inset Quotes eld
\end_inset
-d american-lrg
\begin_inset Quotes erd
\end_inset
.
\layout Subsubsection
List
\layout Standard
To add a value to the list prefix the option name with a
\begin_inset Quotes eld
\end_inset
add-
\begin_inset Quotes erd
\end_inset
and then specify the value to add.
For example to add the URL filter use
\begin_inset Quotes eld
\end_inset
--add-filter url
\begin_inset Quotes erd
\end_inset
.
To remove a value from a list option prefix the option name with a
\begin_inset Quotes eld
\end_inset
rem-
\begin_inset Quotes erd
\end_inset
and then specify the value to remove.
For example to remove the URL filter use
\begin_inset Quotes eld
\end_inset
--rem-filter url
\begin_inset Quotes erd
\end_inset
.
To remove all items from a list prefix the option name with a
\begin_inset Quotes eld
\end_inset
rem-all
\begin_inset Quotes erd
\end_inset
 without specifying any value.
For example to remove all filters use
\begin_inset Quotes eld
\end_inset
--rem-all-filter
\begin_inset Quotes erd
\end_inset
.
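\layout Standard
Putting the three kinds of options together, a single invocation might
 look like the sketch below.
 It enables a boolean option, sets two value options (one through its single
 letter shortcut), and removes an entry from a list option; the file name
 paper.tex is only a placeholder:
\layout Quote
aspell check --ignore-accents -d british --mode=tex --rem-filter url paper.tex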
\layout Subsection
Via a Configuration File
\layout Standard
Aspell can also accept options via a personal or global configuration file.
 The exact files to use are specified by the options 
\series bold
per-conf
\series default
and
\series bold
conf
\series default
 respectively, but the personal configuration file is normally 
\begin_inset Quotes eld
\end_inset
.aspell.conf
\begin_inset Quotes erd
\end_inset
located in the HOME directory and the global one is normally
\begin_inset Quotes eld
\end_inset
aspell.conf
\begin_inset Quotes erd
\end_inset
which is located in the etc directory which is normally
\begin_inset Quotes eld
\end_inset
/usr/etc
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
/usr/local/etc
\begin_inset Quotes erd
\end_inset
.
 To find out the values for your particular system use 
\begin_inset Quotes eld
\end_inset
aspell dump config
\begin_inset Quotes erd
\end_inset
.
\layout Standard
Each line of the configuration file has the format:
\layout LyX-Code
<option> [<value>]
\layout Standard
There may be any number of spaces between the option and the value; however,
 only spaces are allowed, i.e. there is no '=' between the option name and the
 value.
\layout Standard
Comments may also be included by preceding them with a '#' as anything from
a
\begin_inset Quotes eld
\end_inset
#
\begin_inset Quotes erd
\end_inset
to a newline is ignored.
Blank lines are also allowed.
\layout Standard
Values set in the personal configuration file override those in the global
file.
Options specified at either the command line or via an environmental variable
override those specified by either configuration file.
\layout Subsubsection
Boolean
\layout Standard
To specify a boolean option simply include the option followed by a
\begin_inset Quotes eld
\end_inset
true
\begin_inset Quotes erd
\end_inset
to enable it or a
\begin_inset Quotes eld
\end_inset
false
\begin_inset Quotes erd
\end_inset
to disable it.
For example to allow run-together words use
\begin_inset Quotes eld
\end_inset
run-together true
\begin_inset Quotes erd
\end_inset
.
\layout Subsubsection
Value
\layout Standard
To specify a value option simply include the option followed by the corresponding
 value.
 For example to set the default language to German use 
\begin_inset Quotes eld
\end_inset
lang german
\begin_inset Quotes erd
\end_inset
.
\layout Subsubsection
List
\layout Standard
To add a value to the list prefix the option name with a
\begin_inset Quotes eld
\end_inset
add-
\begin_inset Quotes erd
\end_inset
and then specify the value to add.
For example to add the URL filter use
\begin_inset Quotes eld
\end_inset
add-filter url
\begin_inset Quotes erd
\end_inset
.
To remove a value from a list option prefix the option name with a
\begin_inset Quotes eld
\end_inset
rem-
\begin_inset Quotes erd
\end_inset
and then specify the value to remove.
For example to remove the URL filter use
\begin_inset Quotes eld
\end_inset
rem-filter url
\begin_inset Quotes erd
\end_inset
.
To remove all items from a list prefix the option name with a
\begin_inset Quotes eld
\end_inset
rem-all
\begin_inset Quotes erd
\end_inset
 without specifying any value.
For example to remove all filters use
\begin_inset Quotes eld
\end_inset
rem-all-filter
\begin_inset Quotes erd
\end_inset
.
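\layout Standard
As a rough sketch, a small personal configuration file combining a comment
 with one option of each type could read as follows.
 The particular values are chosen only for illustration:
\layout LyX-Code
# personal preferences (illustrative values only)
\layout LyX-Code
sug-mode fast
\layout LyX-Code
run-together true
\layout LyX-Code
add-filter url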
\layout Subsection
Via an Environmental Variable
\layout Standard
The environmental variable ASPELL_CONF may also be used and it overrides
any options set in the configuration file.
The format of the string is exactly the same as the configuration file
except that semicolons ( ; ) are used instead of newlines.
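\layout Standard
For example, assuming a Bourne-style shell, two options could be passed
 for a single run as sketched below.
 The option values and the file name foo.txt are only illustrative:
\layout Quote
ASPELL_CONF="sug-mode fast;run-together true" aspell check foo.txt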
\layout Section
The Options
\layout Standard
The following is a list of available options broken down by category.
Each entry has the following format:
\layout LaTeX
\backslash
begin{quote}
\backslash
begin{description}
\newline
\backslash
item [<option>[,<single letter abbreviation>]] 
\shape italic 
(<type>)
\shape default 
 <description>
\newline
\backslash
end{description}
\backslash
end{quote}
\layout Standard
Where single letter options are specified as they would appear at the command
 line, i.e. with the preceding dash.
Boolean single letter options are specified in the following format:
\layout Quote
-<abbreviation to enable>|-<abbreviation to disable>
\layout Standard
<type> is one of the following: 
\series bold
boolean
\series default
,
\series bold
string
\series default
,
\series bold
file
\series default
,
\series bold
dir
\series default
,
\series bold
integer
\series default
, or
\series bold
list
\series default
.
\series bold
String
\series default
,
\series bold
file
\series default
,
\series bold
dir
\series default
, and
\series bold
integer
\series default
types are all value options which can only take a specific type of value.
\layout Subsection
Basic Options
\layout Description
conf
\shape italic
(file)
\shape default
main configuration file
\layout Description
conf-dir
\shape italic
(dir)
\shape default
location of main configuration file
\layout Description
data-dir
\shape italic
(dir
\shape default
) location of language data files
\layout Description
local-data-dir
\shape italic
(dir)
\shape default
alternative location of language data files.
This directory is searched before
\series bold
data-dir
\series default
.
It defaults to the same directory the actual main word list is in (which
is not necessarily dict-dir).
\layout Description
filter
\shape italic
(list)
\shape default
 adds or removes a filter
\layout Description
home-dir (
\shape italic
dir
\shape default
) location for personal files
\layout Description
ignore,-W (
\shape italic
integer
\shape default
) ignore words <= n chars
\layout Description
ignore-case
\shape italic
(boolean)
\shape default
ignore case when checking words
\layout Description
ignore-accents
\shape italic
(boolean)
\shape default
ignore accents when checking words
\layout Description
ignore-repl
\shape italic
(boolean)
\shape default
ignore commands to store replacement pairs
\layout Description
save-repl
\shape italic
(boolean)
\shape default
save the replacement word list on save all
\layout Description
lang
\shape italic
(string)
\shape default
default language to use when creating a dictionary or all else fails
\layout Description
language-tag
\shape italic
(string)
\shape default
language code to use when selecting a dictionary, it follows the same format
of the LANG environmental variable on most systems.
In fact it defaults to the value of the LANG environmental variable if
it is set.
\layout Description
mode
\shape italic
(string)
\shape default
sets the filter mode.
 Mode is one of none, url, email, sgml, or tex.
 (The shortcut options '-e' may be used for email, '-H' for HTML/SGML,
 or '-t' for TeX)
\layout Description
per-conf
\shape italic
(file)
\shape default
personal configuration file
\layout Description
personal,-p
\shape italic
(file)
\shape default
personal word list file name
\layout Description
prefix
\shape italic
(dir)
\shape default
prefix directory
\layout Description
set-prefix
\shape italic
(boolean)
\shape default
set the prefix based on executable location (only works on Win32 and when
compiled with --enable-win32-relocatable)
\layout Description
repl
\shape italic
(file)
\shape default
replacements list file name
\layout Description
keyboard
\shape italic
(file)
\shape default
the base name of the keyboard definition file to use (see section
\begin_inset LatexCommand \ref{typo}
\end_inset
)
\layout Description
sug-mode
\shape italic
(mode)
\shape default
suggestion mode = ultra | fast | normal | bad-spellers (see section
\begin_inset LatexCommand \ref{suggestion}
\end_inset
)
\layout Subsection
Dictionary Options
\layout Standard
The following options may be used to control which dictionaries to use and
how they behave (see section
\begin_inset LatexCommand \ref{dict-opts}
\end_inset
for more information):
\layout Description
master,-d
\shape italic
(string)
\shape default
base name of the main dictionary to use.
 The default Aspell installation provides the following dictionaries: american,
british, and canadian.
\layout Description
dict-dir
\shape italic
(dir)
\shape default
location of the main word list
\layout Description
extra-dicts
\shape italic
(list)
\shape default
extra dictionaries to use
\layout Description
strip-accents
\shape italic
(boolean)
\shape default
strip accents from all words in the dictionary
\layout Subsection
Run-together Word Options
\layout Standard
These may be used to control the behavior of run-together words (see section
\begin_inset LatexCommand \ref{run-together}
\end_inset
for more information):
\layout Description
run-together,-C|-B
\shape italic
(boolean)
\shape default
consider run-together words legal
\layout Description
run-together-limit
\shape italic
(integer)
\shape default
 maximum number of words that can be strung together
\layout Description
run-together-min
\shape italic
(integer)
\shape default
minimal length of interior words
\layout Subsection
Filter Options
\layout Standard
These options modify the behavior of the various filters (see section
\begin_inset LatexCommand \ref{filter}
\end_inset
for more information):
\layout Description
add|rem-email-quote
\shape italic
(list)
\shape default
email quote characters
\layout Description
email-margin
\shape italic
(integer)
\shape default
 number of characters that can appear before the quote character
\layout Description
sgml-check
\shape italic
(list)
\shape default
sgml tags to always check.
\layout Description
sgml-extension
\shape italic
(list)
\shape default
sgml file extensions.
\layout Description
tex-command
\shape italic
(list)
\shape default
TeX commands
\layout Description
tex-check-comments
\shape italic
(boolean)
\shape default
check TeX comments
\layout Subsection
Aspell Utility Options
\layout Standard
These options may only be specified at the command line as they are
 specific to the aspell utility:
\layout Description
backup,-b|-x
\shape italic
(boolean)
\shape default
create a backup file by appending
\begin_inset Quotes eld
\end_inset
.bak
\begin_inset Quotes erd
\end_inset
to the file name.
(Only applies when the command is
\series bold
check
\series default
)
\layout Description
time
\shape italic
(boolean)
\shape default
 measure the load time and suggestion time in pipe mode.
\layout Description
reverse
\shape italic
(boolean)
\shape default
reverse the order of the suggestions list.
\layout Section
Dumping Configuration Values
\layout Standard
To find out the current value of all the options use the command
\begin_inset Quotes eld
\end_inset
aspell dump config
\begin_inset Quotes erd
\end_inset
.
This will dump the current configuration to standard output.
The format of the contents dumped is such that it can be used as either
the global or personal configuration file.
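\layout Standard
Because the dump goes to standard output it can be captured with ordinary
 shell redirection and used as the starting point for a configuration file.
 The target file name below is only an example:
\layout Quote
aspell dump config > aspell.conf.sample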
\layout Section
Notes on various Options
\layout Subsection
Pertaining to which word lists to use
\begin_inset LatexCommand \label{dict-opts}
\end_inset
\layout Subsubsection
How Aspell Selects an Appropriate Dictionary
\layout Standard
Aspell will go through the following steps to find an appropriate dictionary:
\layout Enumerate
If the
\series bold
master
\series default
 option is set in any fashion (via the command line, the ASPELL_CONF environmental
 variable, or a configuration file) look for a dictionary of that name.
 If one cannot be found, complain.
\layout Enumerate
If the
\series bold
language-tag
\series default
(
\emph on
not
\emph default
\series bold
lang
\series default
) option or LANG environmental variable is set and master option is not
then use it (giving preference to
\series bold
language-tag
\series default
over LANG) to search for an appropriate dictionary.
Aspell will use the same strategy that Pspell does, which is based on the
installed .pwli files, to find an appropriate word list.
 For more information on how this is done see the Pspell manual.
\layout Enumerate
If step 2 fails then look for a dictionary with the same name as the current setting
 of the 
\series bold 
lang
\series default 
 option.
This will currently work even if the language name is invalid, but this
fact should not be relied upon as it is an implementation detail.
\layout Enumerate
Finally, if all else fails, complain.
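\layout Standard
The first two steps can be illustrated with the following sketches, assuming
 a Bourne-style shell.
 In the first the dictionary is named outright; in the second it is left
 to Aspell to pick one from the LANG setting, which only works if a matching
 .pwli file is installed.
 The value en_GB and the file name foo.txt are assumptions made for the
 example:
\layout Quote
aspell check --master=british foo.txt
\newline 
LANG=en_GB aspell check foo.txt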
\layout Subsubsection
About Multi Dictionaries
\layout Standard
As with previous versions of aspell you can specify the main dictionary
to use via the -d or --master option.
However as of Aspell .32 you can now also:
\layout Enumerate
Specify more than one word list to use with the 
\series bold
extra-dicts
\series default
option.
\layout Enumerate
Optionally have all accents stripped from the word lists using the 
\series bold
strip-accents
\series default
option.
This is
\emph on
not
\emph default
the same thing as the
\series bold
ignore-accents
\series default
option.
 Enabling the 
\series bold 
ignore-accents
\series default 
 option would accept both cafe and café (notice the accent on the e), but
 enabling only strip-accents would accept only cafe, even if café is in the original
 dictionary.
 Specifying 
\series bold 
strip-accents
\series default 
 is just like using a word list without the accents.
\layout Enumerate
Specify special
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
dictionaries.
\layout Standard
A
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
 dictionary is a special file which is basically a list of dictionary files
 to use.
 A multi dictionary must end in 
\series bold
.multi
\series default
 and has roughly the same format as a configuration file where the two valid
keys are
\series bold
add
\series default
and
\series bold
strip-accents
\series default
.
The
\series bold
add
\series default
key is used for adding individual word lists, or other
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
files.
The
\series bold
strip-accents
\series default
 key is used to control if accents are stripped from the dictionaries.
 Unlike the global strip-accents option, this option only affects word lists
 that come after the option.
For example:
\layout Quote
strip-accents yes
\newline
add english
\newline
strip-accents no
\newline
add must-accent
\layout Standard
will strip accents from the english word list but not the must-accent word
list.
If the global strip-accents option is specified the local strip-accents
options are ignored.
\layout Subsubsection
Provided Word Lists
\layout Standard
Aspell now provides multi dictionaries for three varieties of English: 
\series bold
american-med
\series default
,
\series bold
british-med
\series default
, and
\series bold
canadian-med
\series default
.
The word lists themselves all contain accented words however the
\series bold
strip-accents
\series default
option is enabled by default for all the individual word lists.
If you wish to use the accented words you can set the global
\series bold
strip-accents
\series default
option to false or create a new multi word list.
\layout Standard
Great care has been taken so that only one spelling for any particular
word is included in the main list.
When two variants were considered equal I randomly picked one for inclusion
in the main word list.
Unfortunately this means that my choice in how to spell a word may not
match your choice.
 If this is the case you can try to include one of the special variant dictionaries
 with the 
\series bold
add-extra-dicts
\series default
option.
 You can choose from english-variant-0, english-variant-1, and english-variant-2.
 Each of these word lists includes all the words from the previous variant
 level, thus there is no need to include more than one.
 English-variant-0 includes most variants which are considered almost equal,
 english-variant-1 includes variants which are also generally considered
 acceptable, and english-variant-2 contains variants which are seldom used.
These special variant dictionaries are an experimental feature so please
let me know if you take advantage of them.
 If no one seems to be using them I may no longer provide them in a future
release of Aspell.
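\layout Standard
As a sketch, and assuming the list option is spelled extra-dicts as in
 the list of dictionary options so that entries are added with the add-
 prefix, a variant dictionary might be included like this (foo.txt is a
 placeholder file name):
\layout Quote
aspell check --add-extra-dicts english-variant-1 foo.txt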
\layout Standard
Many other dictionary sizes and varieties can be created.
See the scowl/ directory in the source distribution for information on
the different varieties you can create and section
\begin_inset LatexCommand \ref{manage}
\end_inset
for how to create an individual dictionary.
\layout Subsection
Notes on Various Filters and Filter Modes
\begin_inset LatexCommand \label{filter}
\end_inset
\layout Standard
Aspell now has rudimentary filter support.
You can either select from individual filters or choose a filter mode.
To select a filter mode use the
\series bold
mode
\series default
option.
You may choose from
\series bold
none
\series default
,
\series bold
url
\series default
,
\series bold
email
\series default
,
\series bold
sgml
\series default
, and
\series bold
tex
\series default
.
The default mode is
\series bold
url
\series default
.
Individual filters can be added with the option
\series bold
add-filter
\series default
and removed with the
\series bold
rem-filter
\series default
option.
The currently available filters are
\series bold
url
\series default
,
\series bold
email
\series default
,
\series bold
sgml
\series default
,
\series bold
tex
\series default
as well as a bunch of filters which translate the text from one format
to another.
\layout Subsubsection
None Mode
\layout Standard
This mode is exactly what it says.
It turns off all filters.
\layout Subsubsection
Url Filter/Mode
\layout Standard
The
\series bold
url
\series default
filter/mode skips over URLs, host names, and email addresses.
Because this filter is almost always useful and rarely does any harm it
is enabled in all modes except
\series bold
none
\series default
.
To turn it off either select the
\series bold
none
\series default
mode or use the
\series bold
rem-filter
\series default
option
\emph on
after
\emph default
the desired mode is selected.
\layout Subsubsection
Email Filter/Mode
\layout Standard
The
\series bold
email
\series default
filter/mode skips over quoted text.
It currently does not support skipping over headers; however, a future version
should.
In the meantime I suggest you use Aspell with Newsbody which can be found
at
\begin_inset LatexCommand \url{http://home.worldonline.dk/~byrial/newsbody/}
\end_inset
.
The option
\series bold
email-skip
\series default
controls the number of characters that can appear before the email quote
character; the default is 10.
The option
\series bold
add|rem-email-quote
\series default
controls the characters that are considered quote characters; the default
is '>' and '|'.
\layout Subsubsection
SGML Filter/Mode
\begin_inset LatexCommand \label{sgml}
\end_inset
\layout Standard
The
\series bold
sgml
\series default
filter/mode will skip over SGML commands.
It currently does not handle nested < > unless they are in quotes.
It also does not handle the null end tag (net) minimization feature of SGML
such as
\layout Quote
>/<>
\series default
filter.
\layout Subsubsection
TeX Filter/Mode
\layout Standard
The
\series bold
tex
\series default
(all lowercase) filter/mode skips over TeX commands and parameters and/or
options to certain commands.
It also skips over TeX comments by default.
The option
\series bold
[dont-]tex-check-comments
\series default
controls whether or not Aspell will skip over TeX comments.
The option
\series bold
add|rem-tex-command
\series default
controls which TeX commands should have certain parameters and/or options
also skipped over.
Commands that are not specified will have all their parameters and/or options
checked.
The format for each item is
\layout Quote
<command>\SpecialChar ~
\SpecialChar ~
<parameters>
\layout Standard
The first item is simply the command name.
The second item controls which parameters to skip over.
A 'p' skips over a parameter while a 'P' won't.
Similarly, an 'o' will skip over an optional parameter while an 'O' won't.
The first letter on the list will apply to the first parameter, the second
letter will apply to the second parameter etc.
If there are more parameters than letters Aspell will simply check them
as normal.
For example the option
\layout Quote
add-tex-command rule pp
\layout Standard
will skip over the first two parameters of the
\begin_inset Quotes eld
\end_inset
rule
\begin_inset Quotes erd
\end_inset
command while the option
\layout Quote
add-tex-command foo Pop
\layout Standard
will
\emph on
check
\emph default
the first parameter of the
\begin_inset Quotes eld
\end_inset
foo
\begin_inset Quotes erd
\end_inset
command, skip over the next optional parameter, if it is present, and will
skip over the second parameter --- even if the optional parameter is not
present --- and will check any additional parameters.
\layout Standard
A '*' at the end of the command is simply ignored.
For example the option
\layout Quote
enlargethispage p
\layout Standard
will ignore the first parameter in both enlargethispage and enlargethispage*.
\layout Standard
To remove a command simply use the
\series bold
rem-tex-command
\series default
option.
For example
\layout Quote
rem-tex-command foo
\layout Standard
will remove the command foo, if present, from the list of TeX commands.
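\layout Standard
The way the parameter letters are applied can be sketched in Python.
This is only an illustration of the rules described above, not Aspell's actual code: the function name, the "opt"/"req" tags, and the handling of an optional argument that has no matching letter (it is checked) are assumptions.

```python
def classify_tex_args(spec, args):
    """Decide which arguments of one TeX command get spell checked.

    spec -- parameter letters as given to add-tex-command, e.g. "Pop"
    args -- arguments actually present, as (kind, text) pairs where kind
            is "opt" for [..] and "req" for {..} (hypothetical tags)
    Returns a list of (text, is_checked) pairs.
    """
    letters = list(spec)
    result = []
    for kind, text in args:
        # An 'o'/'O' letter whose optional parameter is absent is dropped.
        while letters and letters[0] in "oO" and kind == "req":
            letters.pop(0)
        if letters and (letters[0] in "oO") == (kind == "opt"):
            letter = letters.pop(0)
            checked = letter in "PO"   # upper case: check, lower case: skip
        else:
            # No letter applies, so the argument is checked as normal.
            checked = True
        result.append((text, checked))
    return result


# "foo Pop": check the first parameter, skip the optional one and the
# second parameter, check anything after that.
print(classify_tex_args("Pop", [("req", "a"), ("opt", "b"),
                                ("req", "c"), ("req", "d")]))
```

With the optional parameter left out, the second required parameter is still skipped, which matches the description of the foo example.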
\layout Subsection
Notes on the Prefix Option
\layout Standard
The
\series bold
prefix
\series default
option is there to allow Aspell to easily be relocated.
Changing
\series bold
prefix
\series default
will change all directory names relative to the new prefix that are not
explicitly set.
For example if
\series bold
prefix
\series default
was
\begin_inset Quotes eld
\end_inset
/usr/local/aspell
\begin_inset Quotes erd
\end_inset
and
\series bold
dict-dir
\series default
has a default value of
\begin_inset Quotes eld
\end_inset
/usr/local/aspell/dict
\begin_inset Quotes erd
\end_inset
then changing
\series bold
prefix
\series default
to
\begin_inset Quotes eld
\end_inset
/opt/aspell
\begin_inset Quotes erd
\end_inset
will also change the default value of
\series bold
dict-dir
\series default
to
\begin_inset Quotes eld
\end_inset
/opt/aspell/dict
\begin_inset Quotes erd
\end_inset
.
Note that modifying prefix will only affect the default compiled-in values
of directories.
If a directory option is explicitly given a value then changing the value
of
\series bold
prefix
\series default
has no effect on that directory option.
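\layout Standard
The relocation rule can be illustrated with a small Python sketch.
The compiled-in values are the ones from the example above; the function and variable names are hypothetical and do not correspond to anything in Aspell itself.

```python
import posixpath

# Compiled-in values taken from the example in the text.
COMPILED_PREFIX = "/usr/local/aspell"
COMPILED_DEFAULTS = {"dict-dir": "/usr/local/aspell/dict"}


def resolve_dir(option, prefix, explicit=None):
    """Return the directory for `option` under the given prefix.

    An explicitly set value always wins; otherwise the compiled-in
    default is re-rooted from the compiled prefix to the new prefix.
    """
    explicit = explicit or {}
    if option in explicit:
        return explicit[option]
    relative = posixpath.relpath(COMPILED_DEFAULTS[option], COMPILED_PREFIX)
    return posixpath.join(prefix, relative)


print(resolve_dir("dict-dir", "/opt/aspell"))
print(resolve_dir("dict-dir", "/opt/aspell", {"dict-dir": "/srv/dicts"}))
```

The sketch assumes every default lies below the compiled-in prefix; how a default outside the prefix is treated is not covered here.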
\layout Subsection
Notes on Typo-Analysis and the Keyboard Definition File
\begin_inset LatexCommand \label{typo}
\end_inset
\layout Standard
Aspell .33 and later will, in general, give a higher priority to certain
misspellings which are likely to be due to typos such as
\begin_inset Quotes eld
\end_inset
teh
\begin_inset Quotes erd
\end_inset
instead of
\begin_inset Quotes eld
\end_inset
the
\begin_inset Quotes erd
\end_inset
or
\begin_inset Quotes eld
\end_inset
hapoy
\begin_inset Quotes erd
\end_inset
instead of
\begin_inset Quotes eld
\end_inset
happy
\begin_inset Quotes erd
\end_inset
.
However in order to do this well Aspell needs to know the layout of the
keyboard.
The keyboard definition file simply identifies keys that are right next
to each other.
The file has an extension of .kbd and each line consists of two letters
corresponding to two keys that are right next to each other.
For example the line
\begin_inset Quotes eld
\end_inset
as
\begin_inset Quotes erd
\end_inset
will indicate that '
\family typewriter
\series bold
a
\family default
\series default
' and '
\family typewriter
\series bold
s
\family default
\series default
' are right next to each other.
If
\begin_inset Quotes eld
\end_inset
as
\begin_inset Quotes erd
\end_inset
is listed as an entry it is not necessary to list
\begin_inset Quotes eld
\end_inset
sa
\begin_inset Quotes erd
\end_inset
as an entry as that will be done automatically.
Also by
\begin_inset Quotes eld
\end_inset
right next to each other
\begin_inset Quotes erd
\end_inset
I mean two keys that are close enough together that it is easy to type one
instead of the other.
On most keyboards this means keys that are to the left or to the right
of each other and not
\emph on
keys
\emph default
that are below or above each other.
\layout Standard
The default for this option is normally
\begin_inset Quotes eld
\end_inset
standard
\begin_inset Quotes erd
\end_inset
.
However the default can be changed via the language data file.
The normal default,
\begin_inset Quotes eld
\end_inset
standard
\begin_inset Quotes erd
\end_inset
, should work well for most QWERTY like keyboard layouts.
It may need minor adjusting for foreign keyboards and will need to be completel
y rewritten for a Dvorak layout.
When creating a keyboard definition file for a foreign language please
keep in mind that Aspell completely ignores accents when scoring words
so that the key '
\family typewriter
\series bold
o
\family default
\series default
' and '
\family typewriter
\series bold
ö
\family default
\series default
' will appear to be the same key to aspell even if they are in fact separate
keys on your keyboard.
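\layout Standard
As an illustration of the file format, the following Python sketch reads two-letter lines and records each pair in both directions, so that listing "as" also covers "sa".
The function name is hypothetical, and silently ignoring lines that are not exactly two characters long is an assumption, not documented Aspell behaviour.

```python
def parse_keyboard_definition(text):
    """Return the set of (key, neighbour) pairs described by a .kbd file."""
    adjacent = set()
    for line in text.splitlines():
        line = line.strip()
        if len(line) != 2:
            continue  # assumption: skip anything that is not a key pair
        first, second = line
        adjacent.add((first, second))
        adjacent.add((second, first))  # the reverse pair is implied
    return adjacent


pairs = parse_keyboard_definition("as\nsd\n")
print(("s", "a") in pairs)
```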
\layout Subsection
Notes on the Different Suggestion Modes
\begin_inset LatexCommand \label{suggestion}
\end_inset
\layout Standard
In order to understand what these suggestion modes do, a basic understanding
of how aspell works is required.
See section
\begin_inset LatexCommand \ref{works}
\end_inset
for that.
The suggestion modes are as follows.
\layout Description
ultra This method will use the fastest method available to come up with
decent suggestions.
This currently means that it will look for soundslikes within an edit
distance of one without doing any typo analysis.
It is slower than Ispell by a factor of 1.5 to 2 when a single word list
is used.
Its speed is only slightly affected by the size of the word list, if at all,
but it is strongly affected by the number of word lists used.
In this mode Aspell gets about 87% of the words from my small test kernel
of misspelled words.
(Go to
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/test}
\end_inset
for more info on the test kernel as well as comparisons of this version
of Aspell with previous versions and other spell checkers.)
\layout Description
fast This method is like ultra except that it also performs typo analysis
unless it is turned off by setting the keyboard to none.
The typo analysis brings words which are likely to be due to typos to the
beginning of the list but slows things down by a factor of about two.
This mode should get around the same number of words that the ultra method
does.
\layout Description
normal This method looks for soundslikes within an edit distance of two
and performs typo analysis unless it is turned off.
It is around 10 times slower than fast mode with the English word list
but returns better suggestions.
Its speed is directly proportional to the size of the word list.
This mode gets 93% of the words.
\layout Description
bad-spellers This method also looks for soundslikes within an edit distance
of two but is more tailored for the bad speller whereas fast or normal
are more tailored to strike a good balance between typos and true misspellings.
This mode never performs typo-analysis and returns a
\emph on
huge
\emph default
number of words for the really bad spellers who can't seem to get the spelling
anything close to what it should be.
If the misspelled word looks anything like the correct spelling it is bound
to be found
\emph on
somewhere
\emph default
on the list of 100 or more suggestions.
This mode gets 98% of the words.
\layout Chapter
Writing programs to use Aspell
\layout Standard
There are two main ways to use Aspell from within your application:
through the Pspell API or through a pipe.
The Aspell API can be used directly but that is not recommended as the
actual Aspell API is constantly changing.
\layout Section
Through the Pspell API
\layout Standard
To use Aspell through the Pspell API please see the Pspell manual.
\layout Subsection
Notes About Thread Safety
\layout Standard
Read-only Aspell methods and functions should be thread safe as long as
exceptions, new, delete, delete[], and STL allocators are thread safe.
To the best of my knowledge gcc and egcs meet these requirements.
It is up to the programmer to make sure multiple threads do not do things
such as change the dictionaries and add or delete items from the personal
or session dictionaries.
\layout Section
Through A Pipe
\begin_inset LatexCommand \label{pipe}
\end_inset
\layout Standard
When given the
\series bold
pipe
\series default
or
\series bold
-a
\series default
command, Aspell goes into a pipe mode that is compatible with
\begin_inset Quotes eld
\end_inset
ispell -a
\begin_inset Quotes erd
\end_inset
.
Aspell also defines its own set of extensions to ispell pipe mode.
\layout Subsection
Format of the Data Stream
\begin_inset LatexCommand \label{data_stream}
\end_inset
\layout Standard
In this mode, Aspell prints a one-line version identification message, and
then begins reading lines of input.
For each input line, a single line is written to the standard output for
each word checked for spelling on the line.
If the word was found in the main dictionary, or your personal dictionary,
then the line contains only a '*'.
\layout Standard
If the word is not in the dictionary, but there are suggestions, then the
line contains an '&', a space, the misspelled word, a space, the number
of near misses, the number of characters between the beginning of the line
and the beginning of the misspelled word, a colon, another space, and a
list of the suggestions separated by commas and spaces.
\layout Standard
Finally, if the word does not appear in the dictionary, and there are no
suggestions, then the line contains a '#', a space, the misspelled word,
a space, and the character offset from the beginning of the line.
Each sentence of text input is terminated with an additional blank line,
indicating that ispell has completed processing the input line.
\layout Standard
These output lines can be summarized as follows:
\layout Description
OK: *
\layout Description
Suggestions: & <word> <count> <offset>: <suggestion>, <suggestion>, ...
\layout Description
None: # <word> <offset>
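\layout Standard
A program reading this stream might decode each output line roughly as in the following Python sketch.
It only handles the three line types summarized above; the function name and the dictionary keys are hypothetical, and the sample line is made up for illustration rather than taken from real Aspell output.

```python
def parse_pipe_line(line):
    """Decode one result line of the pipe mode into a dictionary.

    Returns None for lines that are not one of the three result types
    (for example the version banner or the blank terminator line).
    """
    line = line.rstrip("\n")
    if line == "*":
        return {"status": "ok"}
    if line.startswith("& "):
        # "& word count offset: suggestion, suggestion, ..."
        head, _, tail = line.partition(": ")
        _, word, count, offset = head.split(" ")
        return {"status": "suggestions", "word": word,
                "count": int(count), "offset": int(offset),
                "suggestions": tail.split(", ")}
    if line.startswith("# "):
        # "# word offset"
        _, word, offset = line.split(" ")
        return {"status": "none", "word": word, "offset": int(offset)}
    return None


print(parse_pipe_line("& teh 3 0: the, tea, ten"))
```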
\layout Standard
When in the -a mode, Aspell will also accept lines of single words prefixed
with any of '*', '&', '@', '+', '-', '~', '#', '!', '%', or '^'.
A line starting with '*' tells ispell to insert the word into the user's
dictionary.
A line starting with '&' tells ispell to insert an all-lowercase version
of the word into the user's dictionary.
A line starting with '@' causes ispell to accept this word in the future.
A line starting with '+', followed immediately by a valid mode will cause
aspell to parse future input according to the syntax of that formatter.
A line consisting solely of a '+' will place ispell in TeX/LaTeX mode (similar
to the -t option) and '-' returns aspell to its default mode (but these
commands are obsolete).
A line starting with '~' is ignored for ispell compatibility.
A line prefixed with '#' will cause the personal dictionaries to be saved.
A line prefixed with '!' will turn on terse mode (see below), and a line
prefixed with '%' will return ispell to normal (non-terse) mode.
Any input following the prefix characters '+', '-', '#', '!', '~', or '%'
is ignored.
To allow spell-checking of lines beginning with these characters, a line
starting with '^' has that character removed before it is passed to the
spell-checking code.
It is recommended that programmatic interfaces prefix every data line with
an uparrow to protect themselves against future changes in Aspell.
\layout Standard
To summarize these:
\layout Description
*<word> Add a word to the personal dictionary
\layout Description
&<word> Insert the all-lowercase version of the word in the personal dictionary
\layout Description
@<word> Accept the word, but leave it out of the dictionary
\layout Description
# Save the current personal dictionary
\layout Description
~ Ignored for ispell compatibility.
\layout Description
+ Enter TeX mode.
\layout Description
+<mode> Enter the mode specified by <mode>.
\layout Description
- Enter the default mode.
\layout Description
! Enter terse mode
\layout Description
% Exit terse mode
\layout Description
^ Spell-check the rest of the line
\layout Standard
In terse mode, Aspell will not print lines beginning with '*', which indicate
correct words.
This significantly improves running speed when the driving program is going
to ignore correct words anyway.
\layout Standard
In addition to the above commands which are designed for Ispell compatibility
Aspell also supports its own extensions.
All Aspell extensions use the following format.
\layout Quote
$$<command> [data]
\layout Standard
Where data may or may not be required depending on the particular command.
Aspell currently supports the following commands.
\layout Description
cs\SpecialChar ~
<option>,<value> Change a configuration option.
\layout Description
cr\SpecialChar ~
<option> Prints the value of a configuration option.
\layout Description
s\SpecialChar ~
<word1>,<word2> Returns the score of the two words based roughly on
how aspell would score them.
\layout Description
Sw\SpecialChar ~
<word> Returns the soundlike equivalent of the word.
\layout Description
Sl\SpecialChar ~
<word> Returns a list of words that have the same soundlike equivalent.
\layout Description
Pw\SpecialChar ~
<word> Returns the phoneme equivalent of the word.
\layout Description
pp Returns a list of all words in the current personal wordlist.
\layout Description
ps Returns a list of all words in the current session dictionary.
\layout Description
l Returns the current language name.
\layout Description
ra\SpecialChar ~
<mis>,<cor> Add the word pair to the replacement dictionary for later
use.
Returns nothing.
\layout Standard
Anything returned is returned on its own line.
All lists returned have the following format
\layout Quote
<>: <>, <>, <>
\layout Standard
\emph on
(Part of the preceding section was directly copied out of the Ispell manual)
\layout Section
Notes on Storing Replacement Pairs
\begin_inset LatexCommand \label{replpair}
\end_inset
\layout Standard
As of version .27 of Aspell, storing replacement pairs has a memory,
which means that if you first store the replacement pair:
\layout Quote
sicolagest -> psycolagest
\layout Standard
then store the replacement pair
\layout Quote
psycolagest -> psychologist
\layout Standard
The replacement pair
\layout Quote
sicolagest -> psychologist
\layout Standard
will also get stored so that you don't have to worry about it.
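\layout Standard
The chaining behaviour can be pictured with the following Python sketch.
It is a model of the behaviour described above and not Aspell's implementation: the function name is hypothetical, and keeping the older pair alongside the new one is an assumption based on the wording "will also get stored".

```python
from collections import defaultdict


def store_replacement(pairs, misspelled, correction):
    """Store misspelled -> correction and extend earlier pairs.

    Any word that was previously corrected to `misspelled` also
    receives `correction` as a replacement.
    """
    for replacements in list(pairs.values()):
        if misspelled in replacements:
            replacements.add(correction)
    pairs[misspelled].add(correction)


pairs = defaultdict(set)
store_replacement(pairs, "sicolagest", "psycolagest")
store_replacement(pairs, "psycolagest", "psychologist")
print(sorted(pairs["sicolagest"]))
```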
\layout Chapter
Adding Support For Other Languages
\begin_inset LatexCommand \label{inter}
\end_inset
\layout Standard
Before you consider adding support for Aspell for your language first check
if it is already done via the Aspell-Dicts project.
As of the release of support for the following languages is now available:
\layout Quotation
\begin_inset Tabular
\begin_inset Text
\layout Standard
Breton
\end_inset
|
\begin_inset Text
\layout Standard
[br]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Catalan
\end_inset
|
\begin_inset Text
\layout Standard
[ca]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Czech
\end_inset
|
\begin_inset Text
\layout Standard
[cs]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Danish
\end_inset
|
\begin_inset Text
\layout Standard
[da]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Dutch
\end_inset
|
\begin_inset Text
\layout Standard
[nl]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Esperanto
\end_inset
|
\begin_inset Text
\layout Standard
[eo]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Faroese
\end_inset
|
\begin_inset Text
\layout Standard
[fo]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
French
\end_inset
|
\begin_inset Text
\layout Standard
[fr]
\end_inset
|
\begin_inset Text
\layout Standard
(Standard [fr_FR] and Swiss French [fr_CH] and in three sizes: small, medium
and large)
\end_inset
|
\begin_inset Text
\layout Standard
German
\end_inset
|
\begin_inset Text
\layout Standard
[de]
\end_inset
|
\begin_inset Text
\layout Standard
(Standard [de_DE] and Swiss German [de_CH])
\end_inset
|
\begin_inset Text
\layout Standard
Italian
\end_inset
|
\begin_inset Text
\layout Standard
[it]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Norwegian
\end_inset
|
\begin_inset Text
\layout Standard
[no]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Polish
\end_inset
|
\begin_inset Text
\layout Standard
[pl]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Portuguese
\end_inset
|
\begin_inset Text
\layout Standard
[pt]
\end_inset
|
\begin_inset Text
\layout Standard
(Standard [pt_PT] and Brazilian [pt_BR])
\end_inset
|
\begin_inset Text
\layout Standard
Russian
\end_inset
|
\begin_inset Text
\layout Standard
[ru]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Spanish
\end_inset
|
\begin_inset Text
\layout Standard
[es]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\begin_inset Text
\layout Standard
Swedish
\end_inset
|
\begin_inset Text
\layout Standard
[sv]
\end_inset
|
\begin_inset Text
\layout Standard
\end_inset
|
\end_inset
\layout Standard
You can find all of these packages at the aspell home page (
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/}
\end_inset
).
\layout Standard
If your language is not listed above please send me a note and I will work
with you on adding support.
\emph on
The instructions below still apply to this version of Aspell; however,
they may not once Aspell is merged into Pspell (see section
\begin_inset LatexCommand \ref{future}
\end_inset
).
\layout Standard
Adding a language to aspell is fairly straightforward.
At the very least you need to create the language data file and compile a
new word list.
You should also create a PWLI file for each of your word lists so that
your new word lists will work correctly with the LANG environment variable
and with Pspell.
\layout Standard
Please note, however, that Aspell international support is not 100% done
yet.
More information on my future plans for international support in Aspell
can be found at
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/international/}
\end_inset
.
\layout Section
The Language Data File
\layout Standard
The basic format of the language data file is the same as that of the Aspell
configuration file.
It is named <lang>.dat and is located in the architecture independent
data dir for aspell (option
\series bold
data-dir
\series default
) which is usually <prefix>/share/aspell.
Use
\begin_inset Quotes eld
\end_inset
aspell config
\begin_inset Quotes erd
\end_inset
to find out where it is in your installation.
\layout Standard
The language data file has several mandatory fields, and several optional
ones.
All fields are case sensitive and should be in all lower case.
\layout Standard
The two mandatory fields are
\series bold
name
\series default
and
\series bold
charset
\series default
.
\series bold
Name
\series default
is the name of the language and should be the same as the file name (without
the .dat).
\series bold
Charset
\series default
is the charset aspell will expect the word lists to be formatted in.
You may choose from any of the iso-8859-* character sets as well as koi8-f,
koi8-r, and viscii.
If your language can fit in the plain old ASCII character set use iso8859-1.
If you use some other character set for your language other than the ones
listed here drop me a note and I will look into adding support for it.
\layout Standard
The optional fields are
\series bold
special, soundslike
\series default
,
\series bold
keyboard
\series default
and a bunch of options to specify how run-together words are handled.
\series bold
Special
\series default
is for non letter characters that can appear in your language such as the
\series bold
'
\series default
and
\series bold
-
\series default
.
The format for the value is a list separated by spaces.
Each item of the list has the following format
\layout Quote
<char>\SpecialChar ~
\SpecialChar ~
<begin><middle><end>
\layout Standard
<char> is the non letter character in question.
<begin>, <middle>, and <end> are each either a '-' or a '*'.
A '*' for <begin> means that the character can appear at the beginning of
a word, a '-' means it can't.
The same is true for <middle> and <end>.
For example the entry for the
\series bold
'
\series default
in English is:
\layout LyX-Code
' -*-
\layout Standard
To include more than one middle character just list them one after another
on the same line.
For example, to make both the ' and the - a middle character use the following
line in the language data file:
\layout LyX-Code
special ' -*- - -*-
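\layout Standard
To make the value format concrete, here is a Python sketch that splits a special value into per-character flags.
The function name and the result keys "begin", "middle" and "end" are chosen for illustration (the position names follow the description above), and rejecting malformed input with an exception is an assumption.

```python
def parse_special(value):
    """Parse the value of the special field into per-character flags.

    The value is a space separated list of  character, flags  items,
    where flags is three characters, each '*' (allowed) or '-' (not).
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected character/flags pairs")
    result = {}
    for char, flags in zip(tokens[0::2], tokens[1::2]):
        if len(flags) != 3 or set(flags) - {"*", "-"}:
            raise ValueError("bad flags for %r: %r" % (char, flags))
        begin, middle, end = (flag == "*" for flag in flags)
        result[char] = {"begin": begin, "middle": middle, "end": end}
    return result


print(parse_special("' -*- - -*-"))
```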
\layout Standard
The
\series bold
soundslike
\series default
option,
\series bold
\series default
if present, should be the name of the soundslike data for the language.
The data is expected to be in the file <name>_phonetic.dat.
\layout Standard
If the name is
\series bold
generic
\series default
a really generic soundslike algorithm will be used which consists of stripping
all the vowels and removing all accents.
I recommend first using the generic algorithm and then, after aspell is
working with the new language, work on the transformation array.
\layout Standard
If the soundslike name is
\series bold
none
\series default
then no soundslike lookup table will be used.
This will reduce the size of the compiled word list by around 50% but at
the sacrifice of suggestion quality.
If the soundslike is none then the soundslike for the word will simply
be the word itself in lowercase, with all accents stripped.
For languages with phonetic spelling the difference will not be very noticeable.
However, for languages with non-phonetic spelling there will be a noticeable
difference.
The difference you notice will depend on the quality of the soundslike
data file.
If you do not notice much of a difference for a language with non-phonetic
spelling that is a good indication that the soundslike data is not rough
enough---or the words you are trying are not that badly misspelled.
\layout Standard
The keyboard option specifies the base name of the keyboard definition file
to use.
See section
\begin_inset LatexCommand \ref{typo}
\end_inset
for more information.
\layout Standard
The options to control how run-together words are handled are the same as
they are in the normal configuration files.
Please see section
\begin_inset LatexCommand \ref{run-together}
\end_inset
for more information.
\layout Section
Compiling the Word List
\layout Standard
Once you have a working language data file installed in the right place
you are ready to compile the main word list.
See section
\begin_inset LatexCommand \ref{manage}
\end_inset
to find out what to do.
This section also includes instructions for creating the PWLI file.
\layout Section
Phonetic Code
\layout Standard
\shape italic
(The following section was written by Björn Jacke, bjoern.jacke at gmx de)
\layout Standard
Aspell is in fact the spell checker that comes up with the best suggestions
if it finds an unknown word.
One reason is that it does not just compare the word with other words in
the dictionary (like Ispell does) but also uses phonetic comparisons with
other words.
\layout Standard
The new table driven phonetic code is very flexible and setting up phonetic
transformation rules for other languages is not difficult but there can
be a number of stumbling stones --- that's why I wrote this section.
\layout Standard
The main phonetic code is free of any language specific code and should
be powerful enough to allow setting up rules for any language.
Anything which is language specific is kept in a plain text file and can
easily be edited.
So it's even possible to write phonetic transformation rules if you don't
have any programming skills.
All you need to know is how words of the language are written and how they
are pronounced.
\layout Subsection
Syntax of the transformation array
\layout Standard
In the translation array there are two strings on each line; the first one
is the search string (or switch name) and the second one is the replacement
string (or switch parameter).
The line
\layout LyX-Code
version <version>
\layout Standard
is also required to appear somewhere in the translation array.
The version string can be anything but it should be changed whenever
a new version of the translation array is released.
This is important because it will keep Aspell from using a compiled dictionary
with the wrong set of rules.
For example, when coming up with suggestions for
\begin_inset Quotes eld
\end_inset
hallo
\begin_inset Quotes erd
\end_inset
aspell will use the new rules to come up with the soundslike say
\begin_inset Quotes eld
\end_inset
H*L*
\begin_inset Quotes erd
\end_inset
but if hello is stored in the dictionary using the old rules as
\begin_inset Quotes eld
\end_inset
HL
\begin_inset Quotes erd
\end_inset
instead of
\begin_inset Quotes eld
\end_inset
H*L*
\begin_inset Quotes erd
\end_inset
aspell will never be able to come up with hello.
So to solve this problem Aspell checks if the version strings match and
aborts with an error if they don't.
Thus it is important to update it whenever a new version of the translation
array is released.
This is only a problem with the main word list as the personal word lists
are now stored as simple word lists with a single header line (i.e., no soundslike
data).
\layout Standard
Each non switch line represents one replacement (transformation) rule.
Search strings beginning with the same letter must be grouped together; the order
inside a group is not alphabetical but determines priority:
the higher the rule, the higher the priority.
That's why the first rule that matches is applied.
In the following example:
\layout LyX-Code
\layout LyX-Code
GH _
\newline
G K
\newline
\layout Standard
\begin_inset Quotes eld
\end_inset
GH
\latex latex
\backslash
nach
\latex default
_
\begin_inset Quotes erd
\end_inset
has higher priority than
\begin_inset Quotes eld
\end_inset
G
\latex latex
\backslash
nach
\latex default
K
\begin_inset Quotes erd
\end_inset
.
\begin_inset Quotes eld
\end_inset
_
\begin_inset Quotes erd
\end_inset
represents the empty string
\begin_inset Quotes eld
\end_inset
\begin_inset Quotes erd
\end_inset
.
If
\begin_inset Quotes eld
\end_inset
GH
\latex latex
\backslash
nach
\latex default
_
\begin_inset Quotes erd
\end_inset
stood after
\begin_inset Quotes eld
\end_inset
G
\latex latex
\backslash
nach
\latex default
K
\begin_inset Quotes erd
\end_inset
, the second rule would never match because the algorithm would stop searching
for more rules after the first match.
The above rules transform any
\begin_inset Quotes eld
\end_inset
GH
\begin_inset Quotes erd
\end_inset
to an empty string (delete them) and transform any other
\begin_inset Quotes eld
\end_inset
G
\begin_inset Quotes erd
\end_inset
to
\begin_inset Quotes eld
\end_inset
K
\begin_inset Quotes erd
\end_inset
.
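\layout Standard
The effect of rule order can be tried out with a small Python sketch.
It models only plain search strings with first-match priority; it is not the table driven code used by Aspell, the function name is hypothetical, and copying a letter unchanged when no rule matches is an assumption made to keep the example short.

```python
def apply_rules(word, rules):
    """Apply (search, replacement) rules to word, first match wins.

    rules is an ordered list; "_" as replacement stands for the empty
    string. Options in brackets, dashes and "<" are not modelled here.
    """
    out = []
    i = 0
    while i < len(word):
        for search, replacement in rules:
            if word.startswith(search, i):
                out.append("" if replacement == "_" else replacement)
                i += len(search)
                break
        else:
            out.append(word[i])  # assumption: no rule, keep the letter
            i += 1
    return "".join(out)


rules = [("GH", "_"), ("G", "K")]
print(apply_rules("GHOGA", rules))
print(apply_rules("GH", list(reversed(rules))))
```

With the rules in the order shown, "GH" is deleted and the remaining "G" becomes "K"; with the order reversed the "GH" rule is never reached.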
\layout Standard
At the end of the first string of a line (the search string) there may optionally
be a number of characters in parentheses.
One (only one!) of these characters must match.
It's comparable with the [ ] brackets in regular expressions.
The rule
\begin_inset Quotes eld
\end_inset
DG(EIY)
\latex latex
\backslash
nach
\latex default
J
\begin_inset Quotes erd
\end_inset
for example would match any
\begin_inset Quotes eld
\end_inset
DGE
\begin_inset Quotes erd
\end_inset
,
\begin_inset Quotes eld
\end_inset
DGI
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
DGY
\begin_inset Quotes erd
\end_inset
and replace them with
\begin_inset Quotes eld
\end_inset
J
\begin_inset Quotes erd
\end_inset
.
This way you can reduce several rules to one.
\layout Standard
The search string can be followed by one or more dashes (-).
Such search strings are matched in full but only the beginning of
the matched string is replaced.
Furthermore for these rules no follow-up rule will be searched (what this
is will be explained later).
The rule
\begin_inset Quotes eld
\end_inset
TCH--
\latex latex
\backslash
nach
\latex default
_
\begin_inset Quotes erd
\end_inset
will match any word containing
\begin_inset Quotes eld
\end_inset
TCH
\begin_inset Quotes erd
\end_inset
(like
\begin_inset Quotes eld
\end_inset
match
\begin_inset Quotes erd
\end_inset
) but will only replace the first character
\begin_inset Quotes eld
\end_inset
T
\begin_inset Quotes erd
\end_inset
with an empty string.
The number of dashes determines how many characters from the end will not
be replaced.
After the replacement, the search for transformation rules continues with
the unreplaced
\begin_inset Quotes eld
\end_inset
CH
\begin_inset Quotes erd
\end_inset
.
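\layout Standard
The bookkeeping behind the dashes can be sketched as follows.
The helper apply_dash_rule and its arguments are assumptions made for this
example (positions are counted from 0); it is not how Aspell stores rules.

```python
def apply_dash_rule(word, pos, search, dashes, replacement):
    """Try a rule at word[pos:].

    Returns (replacement, next_pos) on a match, or None otherwise.
    The whole search string must match, but the last `dashes` characters
    are left in place and scanned again.
    """
    if not word.startswith(search, pos):
        return None
    consumed = len(search) - dashes
    return replacement, pos + consumed

# Rule "TCH--" with replacement "_" (empty string) applied to MATCH at the T.
print(apply_dash_rule("MATCH", 2, "TCH", 2, ""))  # ('', 3)
print("MATCH"[3:])                                # CH is scanned next
```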
\layout Standard
If a
\begin_inset Quotes eld
\end_inset
<
\begin_inset Quotes erd
\end_inset
is appended to the search string, the search for replacement rules will
continue with the replacement string and not with the next character of
the word.
The rule
\begin_inset Quotes eld
\end_inset
PH<
\latex latex
\backslash
nach
\latex default
F
\begin_inset Quotes erd
\end_inset
for example would replace
\begin_inset Quotes eld
\end_inset
PH
\begin_inset Quotes erd
\end_inset
with
\begin_inset Quotes eld
\end_inset
F
\begin_inset Quotes erd
\end_inset
and then again start to search for a replacement rule for
\begin_inset Quotes eld
\end_inset
F\SpecialChar \ldots{}
\begin_inset Quotes erd
\end_inset
.
If there were also rules like
\begin_inset Quotes eld
\end_inset
FO
\latex latex
\backslash
nach
\latex default
O
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
F
\latex latex
\backslash
nach
\latex default
_
\begin_inset Quotes erd
\end_inset
then words like
\begin_inset Quotes eld
\end_inset
PHOXYZ
\begin_inset Quotes erd
\end_inset
would be transformed to
\begin_inset Quotes eld
\end_inset
OXYZ
\begin_inset Quotes erd
\end_inset
and any occurrences of
\begin_inset Quotes eld
\end_inset
PH
\begin_inset Quotes erd
\end_inset
that are not followed by an
\begin_inset Quotes eld
\end_inset
O
\begin_inset Quotes erd
\end_inset
will be deleted, as in
\begin_inset Quotes eld
\end_inset
PHIXYZ
\latex latex
\backslash
nach
\latex default
IXYZ
\begin_inset Quotes erd
\end_inset
.
The second replacement, however, is not applied if the priority of its rule
is lower than the priority of the first rule.
\layout Standard
Priorities are added to a rule by putting a number between 0 and 9 at the
end of the search string, for example
\begin_inset Quotes eld
\end_inset
ING6
\latex latex
\backslash
nach
\latex default
N
\begin_inset Quotes erd
\end_inset
.
The higher the number, the higher the priority.
\layout Standard
Priorities are especially important for the previously mentioned follow-up
rules.
Follow-up rules are searched for beginning from the last character of the
first search string.
This is a bit complicated, but I hope this example will make it clearer:
\layout LyX-Code
\layout LyX-Code
CHS X
\newline
CH G
\newline
\newline
HAU--1 H
\newline
\newline
SCH SH
\newline
\layout Standard
In this example
\begin_inset Quotes eld
\end_inset
CHS
\begin_inset Quotes erd
\end_inset
in the word
\begin_inset Quotes eld
\end_inset
FUCHS
\begin_inset Quotes erd
\end_inset
would be transformed to
\begin_inset Quotes eld
\end_inset
X
\begin_inset Quotes erd
\end_inset
.
If we take the word
\begin_inset Quotes eld
\end_inset
DURCHSCHNITT
\begin_inset Quotes erd
\end_inset
things look a bit different.
Here
\begin_inset Quotes eld
\end_inset
CH
\begin_inset Quotes erd
\end_inset
belongs together and
\begin_inset Quotes eld
\end_inset
SCH
\begin_inset Quotes erd
\end_inset
belongs together and both are spoken separately.
The algorithm however first finds the string
\begin_inset Quotes eld
\end_inset
CHS
\begin_inset Quotes erd
\end_inset
which must not be transformed as in the previous word
\begin_inset Quotes eld
\end_inset
FUCHS
\begin_inset Quotes erd
\end_inset
.
At this point the algorithm can find a follow-up rule.
It takes the last character of the first matching rule (
\begin_inset Quotes eld
\end_inset
CHS
\begin_inset Quotes erd
\end_inset
) which is
\begin_inset Quotes eld
\end_inset
S
\begin_inset Quotes erd
\end_inset
and looks for the next match, beginning from this character.
What it finds is clear: It finds
\begin_inset Quotes eld
\end_inset
SCH
\latex latex
\backslash
nach
\latex default
SH
\begin_inset Quotes erd
\end_inset
, which has the same priority (no priority means standard priority, which
is 5).
If the priority is the same or higher the follow-up rule will be applied.
Let's take a look at the word
\begin_inset Quotes eld
\end_inset
SCHAUKEL
\begin_inset Quotes erd
\end_inset
.
In this word
\begin_inset Quotes eld
\end_inset
SCH
\begin_inset Quotes erd
\end_inset
belongs together and must not be torn apart.
After the algorithm has found "SCH
\latex latex
\backslash
nach
\latex default
SH" it searches for a follow-up rule for "H"+"AUKEL".
It finds "HAU--1
\latex latex
\backslash
nach
\latex default
H", but does not apply it because its priority is lower than that of
the first rule.
You see that this is a very powerful feature but it also can easily lead
to mistakes.
If you really don't need this feature you can turn it off by putting the
line
\layout LyX-Code
followup 0
\layout Standard
at the beginning of the phonetic table file.
As mentioned, no follow-up rules are searched for rules containing a `-',
but giving such rules a priority is not pointless because they can
themselves be follow-up rules, and in that case the priority matters again.
Follow-up rules of follow-up rules are not searched for because this is
rarely needed.
\layout Standard
The control character "
\latex latex
\backslash
hoch
\latex default
" says that the search string only matches at the beginning of words so
that the rule "RH
\latex latex
\backslash
hoch
\backslash
nach
\latex default
R" will only apply to words like "RHESUS" but not "PERHAPS".
You can append another "
\latex latex
\backslash
hoch
\latex default
" to the search string.
In that case the algorithm treats the rest of the word completely separately
from the first matched string at the beginning.
This is useful for prefixes whose pronunciation does not depend on the
rest of the word and vice versa like "OVER
\latex latex
\backslash
hoch
\backslash
hoch
\latex default
" in English for example.
\layout Standard
Just as "
\latex latex
\backslash
hoch
\latex default
" anchors the search string at the beginning of a word, "$" makes it match
only at the end of a word.
"GN$
\latex latex
\backslash
nach
\latex default
N" matches only words like "SIGN" but not "SIGNUM".
If you use "
\latex latex
\backslash
hoch
\latex default
" and "$" together, both of them must match: "ENOUGH
\latex latex
\backslash
hoch
\latex default
$
\latex latex
\backslash
nach
\latex default
NF" will only match the word "ENOUGH" and nothing else.
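\layout Standard
The two anchors can be pictured as extra conditions on a match.
The following Python sketch uses a hypothetical helper matches_at with
made-up flag names at_start and at_end; it only shows the conditions, not
Aspell's actual matching code.

```python
def matches_at(word, pos, search, at_start=False, at_end=False):
    """Check a search string at word[pos:], honouring the two anchors."""
    if not word.startswith(search, pos):
        return False
    if at_start and pos != 0:
        return False              # the caret anchor: beginning of the word
    if at_end and pos + len(search) != len(word):
        return False              # the "$" anchor: end of the word
    return True

print(matches_at("RHESUS", 0, "RH", at_start=True))    # True
print(matches_at("PERHAPS", 2, "RH", at_start=True))   # False
print(matches_at("SIGN", 2, "GN", at_end=True))        # True
print(matches_at("SIGNUM", 2, "GN", at_end=True))      # False
```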
\layout Standard
Of course you can combine all of the mentioned control characters but they
must occur in this order:
\family typewriter
< - priority \i \^{}
$
\family default
.
All characters must be written in CAPITAL letters.
\layout Standard
If absolutely no rule can be found (this might happen if you use unusual
characters for which you have no replacement rule), the next character will
simply be skipped and the search for replacement rules will continue with
the rest of the word.
\layout Standard
If you want double letters to be reduced to one you must set up a rule like
"LL-
\latex latex
\backslash
nach
\latex default
L".
If double letters in the resulting phonetic word should be allowed, you
must place the line
\layout LyX-Code
collapse_result 0
\layout Standard
at the beginning of your transformation table file; otherwise set the value
to `1'.
The English rules for example strip all vowels from words and so the word
"GOGO" would be transformed to "K" and not to "KK" (as desired) if
\family typewriter
collapse_result
\family default
is set to 1.
That's why the English rules have
\family typewriter
collapse_result
\family default
set to
\family typewriter
0
\family default
.
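\layout Standard
What collapsing the result means can be shown with a short Python sketch.
The function name collapse is made up for this example, and it assumes that
collapsing simply merges runs of identical characters in the phonetic code.

```python
from itertools import groupby

def collapse(code):
    """Merge each run of identical characters into a single character."""
    return "".join(ch for ch, _ in groupby(code))

print(collapse("KK"))        # K   (what GOGO would become with collapsing)
print(collapse("KNTRTKXN"))  # KNTRTKXN (no doubled characters, unchanged)
```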
\layout Subsection
How do I start finally?
\layout Standard
Before you start to write an array of transformation rules, you should be
aware that you have to do some work to make sure that things you do will
result in correct transformation rules.
\layout Subsubsection
Things that come in handy
\layout Standard
First of all you need to have a large word list of the language you want
to make phonetics for.
It should contain about as many words as the dictionary of the spell checker.
If you don't have such a list, you will probably find an Ispell dictionary
at
\begin_inset LatexCommand \url{http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html}
\end_inset
which will help you.
You can then expand the affixes via
\family typewriter
ispell -e
\family default
and then pipe the output through
\family typewriter
\backslash
tr " " "
\backslash
n"
\family default
to put one word on each line.
After that you may have to convert special characters like `é' from
Ispell's internal representation to latin1 encoding.
\family typewriter
sed "s/e'/é/g"
\family default
for example would replace all e' with é.
\layout Standard
Second, you should know how to use regular expressions and how to
use
\family typewriter
grep
\family default
.
You should for example know that
\layout LyX-Code
grep \i \^{}
[\i \^{}
aeiou]qu[io] wordlist | less
\layout Standard
will show you all words that begin with any character but a, e, i, o or
u and then continue with `qui' or `quo'.
This stuff is important for example to find out if a phonetic replacement
rule you want to set up is valid for all words which match the expression
you want to replace.
Taking a look at the regex(7) man page is a good idea.
\layout Subsubsection
What the phonetic code should do
\layout Standard
Normal text comparison works well as long as the typist misspells a word
by pressing a key they did not mean to press.
In these cases usually only one character differs from the original word.
\layout Standard
In cases where the writer didn't know about the correct spelling of the
word however the word may have several characters that differ from the
original word but usually the word would still sound like the original
word.
Someone might think for example that `tough' is spelled `taff'.
No spell checker without phonetic code will come up with the idea that this
might be `tough' but a spell checker that knows that `taff' would be pronounced
like `tough' will make good suggestions to the user.
Another example could be `funetik' and `phonetic'.
\layout Standard
From these examples you can see that the phonetic transformation should not
be too fussy or too precise.
If you implement a whole phonetic dictionary as found in books, it will not
be very useful because many characters could still differ between the
misspelled word and the desired word.
What you should do when you implement the phonetic transformation table is
reduce the number of letters used to those that are really necessary.
\layout Standard
Characters that sound similar should be reduced to one.
In English, for example, `Z' sounds like `S', which is why the
transformation rule
\begin_inset Quotes eld
\end_inset
Z
\latex latex
\backslash
nach
\latex default
S
\begin_inset Quotes erd
\end_inset
is present in the replacement table.
`PH' is spoken like `F' and so we have a
\begin_inset Quotes eld
\end_inset
PH
\latex latex
\backslash
nach
\latex default
F
\begin_inset Quotes erd
\end_inset
rule.
\layout Standard
If you take a closer look you will even see that vowels sound very similar
in English: `contradiction', `cuntradiction', `cantradiction'
or `centradiction' in fact sound nearly the same, don't they? Therefore
the English phonetic replacement rules not only reduce all vowels to one
but even remove them all (removing is done by just setting up no rule for
those letters).
The phonetic code of `contradiction' is `KNTRTKXN' and if you try to read
this letter-monster aloud you will hear that it still sounds a bit like
`contradiction'.
You also see that `D' is transformed to `T' because they sound nearly the
same.
\layout Standard
If you think you have found a regularity you should
\emph on
always
\emph default
take your word list and grep for the corresponding regular expression you
want to make a transformation rule for.
For example, suppose you come up with the idea that all English words ending
in `ough' sound like `AF' at the end because you think of `enough' and `tough'.
If you then grep for the corresponding regular expression with
\begin_inset Quotes eld
\end_inset
grep -i ough$ wordlist
\begin_inset Quotes erd
\end_inset
you will see that the rule you wanted to set up is not correct because
it does not fit words like `although' or `bough'.
So you have to define your rule more precisely or you have to set up exceptions
if the number of words that differ from the desired rule is not too large.
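\layout Standard
If you prefer a script to grep, the same check can be sketched in Python.
The helper words_matching and the tiny sample list are made up for this
example; in practice you would read the words from your own word list file.

```python
import re

def words_matching(words, pattern):
    """Return the words that match a regular expression, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if rx.search(w)]

# A tiny made-up sample standing in for a real word list.
sample = ["enough", "tough", "although", "bough", "through", "toughen"]
print(words_matching(sample, "ough$"))
# ['enough', 'tough', 'although', 'bough', 'through']
```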
\layout Standard
Don't forget about follow-up rules, which can help in many cases but which
can also lead to confusion and side effects.
It's also important to write exceptions in front of the more general rules
("GH" before "G" etc.).
\layout Standard
If you think you have set up a number of rules that may produce some good
results try them out! If you run Aspell as
\family typewriter
\family default
\begin_inset Quotes eld
\end_inset
aspell --lang=<> pipe
\begin_inset Quotes erd
\end_inset
you get a prompt at which you can type in words.
If you just type words, Aspell checks them and makes suggestions
if they are misspelled.
If you type in
\begin_inset Quotes eld
\end_inset
$$Sw <>
\begin_inset Quotes erd
\end_inset
you will see the phonetic transformation and you can test whether your work
does what you want.
\layout Standard
Another good way to check that changes you apply to your rules do not have
any unwanted side effects is to create another list from your word list which
contains not only each word of the word list but also the corresponding
phonetic version of this word on the same line.
If you do this one time before the change and one time after the change
you can make a diff (see
\family typewriter
man diff
\family default
) to see what
\emph on
really
\emph default
changed.
To do this use the command
\begin_inset Quotes eld
\end_inset
aspell --lang=<> soundslike
\begin_inset Quotes erd
\end_inset
.
In this mode Aspell will output the original word and then its soundslike
separated by a tab character for each word you give it.
If you are interested in seeing how the algorithm works you can download
a set of useful programs from
\begin_inset LatexCommand \url{http://members.xoom.com/maccy/spell/phonet-utils.tar.gz}
\end_inset
.
This includes a program that produces a list as mentioned above and another
program which illustrates how the algorithm works.
It uses the same transformation table as Aspell and so it helps a lot during
the process of creating a phonetic transformation table for Aspell.
\layout Standard
During your work you should write down your basic ideas so that other people
are able to understand what you did (and you still know about it after
a few weeks).
The English table, for example, has extensive documentation appended.
\layout Standard
Now you can start experimenting with all the things you just read and perhaps
set up a nice phonetic transformation table for your language to help Aspell
to come up with the best correction suggestions ever seen also for your
language.
Take a look at the Aspell homepage to see if there is already a transformation
table for your language.
If there is one you might also take a look at it to see if it could be
improved.
\layout Standard
If you think that this section helped you or if you think that this is just
a waste of time you can send any feedback to
\emph on
\shape italic
bjoern.jacke@gmx.de
\shape default
.
\layout Section
Controlling the Behavior of Run-together Words
\begin_inset LatexCommand \label{run-together}
\end_inset
\layout Standard
Aspell has support for either unconditionally accepting run-together words
or only accepting certain words in compound formation.
\layout Standard
Support for unconditionally accepting run-together words can either be turned
on in the language data file or as a normal option via the
\series bold
run-together
\series default
option.
The
\series bold
run-together-limit
\series default
option controls the maximum number of words that can be strung together;
the default is normally 255.
The
\series bold
run-together-min
\series default
option controls the minimum length of the individual components of the
run-together word; the default is normally 3.
Both the run-together-limit and run-together-min options may be specified
either in the language data file or as a normal option.
The
\series bold
run-together-mid
\series default
option, which may only be specified in the language data file, may be used
to specify up to three optional characters that may appear between individual
words.
\layout Standard
In order for aspell to conditionally only accept certain words in compounds
those words must be flagged when the compiled word list is being created.
The format for each entry is
\layout Quote
<>:C[1][2][3]<>
\layout Standard
The 1, 2, and 3 control whether the word is allowed to appear at the
beginning, middle, or end of the compound, respectively.
More than one position flag may be specified.
If none of them is specified it is assumed that the word may appear anywhere.
The C is optional if 1, 2, or 3 is specified.
The <> represents an optional character that may appear after
the word in the formation of the compound if the word is not at the end
of the compound.
If the letter is lowercase then the character may appear after the word;
if it is uppercase then that letter must appear after the word.
Only one letter may be specified and it must also be in the list of middle
letters specified via the
\series bold
run-together-mid
\series default
option.
\series bold
\series default
The
\series bold
run-together-limit
\series default
option may also be used to specify the maximum number of words to string
together.
\layout Standard
For example the word list:
\layout Quote
beg:1
\newline
mid:2
\newline
end:3
\newline
any:C
\newline
never
\newline
must:CM
\newline
maybe:Cm
\layout Standard
Means that the word
\begin_inset Quotes eld
\end_inset
beg
\begin_inset Quotes erd
\end_inset
may only appear at the beginning of a compound, the word
\begin_inset Quotes eld
\end_inset
mid
\begin_inset Quotes erd
\end_inset
at the middle, the word
\begin_inset Quotes eld
\end_inset
end
\begin_inset Quotes erd
\end_inset
at the end, and the word
\begin_inset Quotes eld
\end_inset
any
\begin_inset Quotes erd
\end_inset
any place.
The word
\begin_inset Quotes eld
\end_inset
never
\begin_inset Quotes erd
\end_inset
is never accepted in a compound unless the
\series bold
run-together
\series default
option is set.
The word
\begin_inset Quotes eld
\end_inset
must
\begin_inset Quotes erd
\end_inset
may appear anywhere however it must be followed by an
\begin_inset Quotes eld
\end_inset
m
\begin_inset Quotes erd
\end_inset
, while the word maybe may be followed by an
\begin_inset Quotes eld
\end_inset
m
\begin_inset Quotes erd
\end_inset
.
Given the above word list, the following compounds
\layout Quote
begmidend
\newline
begany
\newline
mustmend
\newline
maybeend
\newline
maybemend
\layout Standard
are all legal, but the following are not:
\layout Quote
begmid
\newline
mustend
\newline
neverany
\layout Standard
Individual words such as
\begin_inset Quotes eld
\end_inset
beg
\begin_inset Quotes erd
\end_inset
are always accepted.
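\layout Standard
The flag semantics described above can be sketched in Python as follows.
This is only a model built from the description in this section: the
functions parse_entry, build_table and is_compound are made up for the
example, and the run-together, run-together-limit and run-together-min
options are not modeled.

```python
def parse_entry(entry):
    """Parse 'word:C[1][2][3][letter]'; unflagged words give (word, None)."""
    word, sep, flags = entry.partition(":")
    if not sep:
        return word, None
    rest = flags[1:] if flags.startswith("C") else flags
    # No position flag means the word may appear anywhere.
    positions = {int(c) for c in rest if c in "123"} or {1, 2, 3}
    letters = [c for c in rest if c not in "123"]
    return word, (positions, letters[0] if letters else None)

def build_table(entries):
    """Keep only the words that are flagged for use in compounds."""
    parsed = (parse_entry(e) for e in entries)
    return {word: flags for word, flags in parsed if flags is not None}

def is_compound(text, table):
    """True if text splits into two or more flagged words that obey the flags."""
    def walk(pos, index):
        for word, (positions, mid) in table.items():
            if not text.startswith(word, pos):
                continue
            end = pos + len(word)
            if end == len(text):
                # Last part: needs the end flag and at least one part before it.
                if index >= 1 and 3 in positions:
                    return True
                continue
            if (1 if index == 0 else 2) not in positions:
                continue
            if mid is None:
                nexts = [end]
            elif mid.isupper():
                # Uppercase: the letter must follow the word.
                nexts = [end + 1] if text[end] == mid.lower() else []
            else:
                # Lowercase: the letter may follow the word.
                nexts = [end, end + 1] if text[end] == mid else [end]
            if any(walk(n, index + 1) for n in nexts):
                return True
        return False
    return walk(0, 0)

table = build_table(["beg:1", "mid:2", "end:3", "any:C",
                     "never", "must:CM", "maybe:Cm"])
for candidate in ["begmidend", "mustmend", "begmid", "mustend", "neverany"]:
    print(candidate, is_compound(candidate, table))
```

\layout Standard
With the word list from this section the sketch accepts begmidend and
mustmend and rejects begmid, mustend and neverany.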
\layout Standard
When the
\series bold
run-together
\series default
option is not set Aspell will only accept words that have been flagged
in a run-together word.
When the
\series bold
run-together
\series default
option is set Aspell will accept words which are at least as long as the
value specified in the
\series bold
run-together-min
\series default
option.
If the word's length is less than
\series bold
run-together-min
\series default
then it will only accept the word if it has been flagged.
When the
\series bold
run-together
\series default
option is not set the
\series bold
run-together-min
\series default
option is ignored altogether.
\layout Standard
Currently Aspell only supports run-together words when checking if a word
is in the dictionary.
When coming up with suggestions Aspell treats the word as a normal word
and does not do anything special.
This means that the suggestions will be virtually meaningless when the
actual word is a run-together.
I plan on more intelligently supporting run-together words when coming
up with suggestions in a future version of Aspell.
\layout Chapter
How Aspell Works
\begin_inset LatexCommand \label{works}
\end_inset
\layout Standard
The magic behind my spell checker comes from merging Lawrence Philips'
excellent metaphone algorithm and Ispell's near miss strategy, which is inserting
a space or hyphen, interchanging two adjacent letters, changing one letter,
deleting a letter, or adding a letter.
\layout Standard
The process goes something like this.
\layout Enumerate
Convert the misspelled word to its soundslike equivalent (its metaphone
for English words).
\layout Enumerate
Find all words that have a soundslike within one or two edit distances from
the original word's soundslike.
The edit distance is the total number of deletions, insertions, exchanges,
or adjacent swaps needed to make one string equivalent to the other.
When set to only look for soundslikes within one edit distance it tries
all possible soundslike combinations and checks whether each one is in the
dictionary.
When set to find all soundslikes within two edit distances it scans through
the entire dictionary and quickly scores each soundslike.
The scoring is quick because it will give up if the two soundslikes are
more than two edit distances apart.
\layout Enumerate
Find misspelled words that have a correctly spelled replacement by the same
criteria as in steps 1 and 2.
That is, the misspelled word in the word pair (such as teh -> the) would
appear in the suggestions list as if it were a correct spelling.
\layout Enumerate
Score the result list and return the words with the lowest score.
The score is roughly the weighted average of the weighted edit distance of
the word to the misspelled word and the soundslike equivalent of the two
words.
The weighted edit distance is like the edit distance except that the various
edits have weights attached to them.
\layout Enumerate
Replace the misspelled words that have correctly spelled replacements with
their replacements and remove any duplicates that might arise because of
this.
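\layout Standard
The plain edit distance used in step 2 can be sketched in Python as follows.
This is a textbook dynamic-programming version (deletions, insertions,
exchanges and adjacent swaps each cost 1), written for illustration only;
it is not taken from suggest.cc and does not include the weights used when
scoring.

```python
def edit_distance(a, b):
    """Count deletions, insertions, exchanges and adjacent swaps (cost 1 each)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete everything from a
    for j in range(cols):
        d[0][j] = j                      # insert everything from b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # exchange or match
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]

print(edit_distance("teh", "the"))  # 1
print(edit_distance("TF", "TFK"))   # 1
```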
\layout Standard
Please note that the soundslike equivalent is a rough approximation of how
the word sounds.
It is not the phoneme of the word by any means.
For more details about exactly how each step is performed please see the
file
\series bold
suggest.cc
\series default
.
For more information on the metaphone algorithm please see the data file
\series bold
english_phonet.dat
\series default
.
\layout LaTeX
\backslash
appendix
\layout Chapter
Changelog
\layout Section*
Changes from .33.7 to .33.7.1 (Aug 20, 2001)
\layout Itemize
Minor manual fixes.
\layout Itemize
Compile fix for Gcc 3.0 and Solaris.
\layout Section*
Changes from .33.6.3 to .33.7 (Aug 2, 2001)
\layout Itemize
Updated to Autoconf 2.50 and switched to the HEAD branch of Libtool.
\layout Itemize
Fixed a bug which caused Aspell to crash when typo-analysis is not used
such as when sug-mode is
\series bold
fast
\series default
or
\series bold
bad spellers
\series default
.
\layout Itemize
Added support for typo-analysis even when a soundslike is not used.
\layout Itemize
Fixed a bug which caused extended characters to display incorrectly on some
platforms.
\layout Itemize
Compile fixes so that it will compile with Gcc 3.0.
\layout Itemize
Compile fixes which should allow Aspell to compile with Egcs 1.1.
I have not been able to actually test it though.
Please let me know at kevina@users.sourceforge.net if you have tried
with Egcs 1.1.
\layout Itemize
Compile and configuration script fixes so that USE_FILE_INO will properly
be defined and Aspell will compile correctly when it is defined.
\layout Itemize
More ANSI C++ compliance fixes.
\layout Section*
Changes from .33.6.2 to .33.6.3 (June 3, 2001)
\layout Itemize
Fixed a build problem in the manual/ directory by including manual-text
and manual-html in the distribution.
\layout Section*
Changes from .33.6.1 to .33.6.2 (June 3, 2001)
\layout Itemize
Compile fix so that Aspell will work correctly when not installed in /usr/local.
\layout Itemize
Avoided regenerating the manual unless configured with enable-maintainer-mode.
\layout Itemize
Added the missing documentation files in the scowl directory.
\layout Section*
Changes from .33.6 to .33.6.1 (May 29, 2001)
\layout Itemize
Fixed a formatting problem with the manual involving <<.
\layout Itemize
Added a note about creating PWLI files.
\layout Itemize
Removed the space between the -L and the directory name in the
pspell-module/ Makefile, which caused problems on some platforms.
\layout Itemize
Added the configure option AM_MAINTAINER_MODE to avoid enabling rules which
often cause generated build files to be rebuilt with the wrong version
of Libtool by default.
I don't know why I didn't think to do this a long time ago.
\layout Section*
Changes from .33.5 to .33.6 (May 18, 2001)
\layout Itemize
Fixed a minor bug where some words would have random compound tags attached
to them.
\layout Itemize
Fixed a compile problem on many platforms where fileno is defined as a macro.
\layout Itemize
Updated the description for a few of Aspell's options.
\layout Itemize
Removed the note about Aspell not being able to run when compiled with the
upcoming Gcc 3.0 compiler as things seem to work now.
\layout Itemize
Added a note about Aspell not being able to compile with Egcs 1.1.
\layout Itemize
Added hack to deal with Libtool's interdependencies problem.
See bug #416981 for Pspell for more info.
\layout Section*
Changes from .33 to .33.5 (April 5, 2001)
\layout Itemize
\begin_inset Quotes eld
\end_inset
dump master
\begin_inset Quotes erd
\end_inset
correctly detects which dictionary and language to use based on the LANG
environmental variable.
\layout Itemize
Fixed a problem on Win32 which involves path names that begin with <>:.
\layout Itemize
Bug fixes and enhancements so that Aspell can once again run under MinGW.
You can even use the new full screen interface if Aspell is compiled with
PDCurses.
\layout Itemize
Some major modifications to make Aspell more C++ compliant in order to get
Aspell to compile under the upcoming Gcc 3.0 compiler.
This included only using STL features found in the standard version of
C++.
(Which means Aspell will no longer require using the SGI version of the
STL) This should also make compiling C++ under non-gcc compilers a lot
simpler.
Please note that Aspell still has some problems with the upcoming Gcc 3.0
compiler (see section
\begin_inset LatexCommand \ref{gcc3.0}
\end_inset
for more info).
\layout Itemize
Minor changes to remove some -Wall warnings.
\layout Itemize
Added a hack so that Aspell will properly compile as a shared library under
Solaris.
\layout Itemize
Added a few important missing words to the English word list.
\layout Section*
Changes from .32.6 to .33 (January 28, 2001)
\layout Itemize
Added a new curses-based interface to replace the dumb terminal interface
everyone has been bitching about.
\layout Itemize
Added the ability to give higher priority to words such as "the" instead
of "teh" which are likely to be due to typos.
\layout Itemize
Reorganized the manual so that it is hopefully easier to follow.
\layout Itemize
Ability to automatically select the best dictionary to use based on the
setting of the LANG environmental variable.
\layout Itemize
Expanded the medium dictionary size to include more words which included
the original words found in ispell and eliminated the large size for now.
\layout Itemize
Added three special variant add-on dictionaries.
\layout Itemize
Switched to the multi-language branch of the CVS version of libtool.
\layout Section*
Changes from .32.5 to .32.6 (Nov 8, 2000)
\layout Itemize
Fixed a bug where Aspell would crash when reading in accented characters
on some platforms.
This fixes bug # 112435.
\layout Itemize
Fixed some other bugs so that it will run under Win32 under CygWin.
Unfortunately it still won't run properly under Mingw.
\layout Itemize
Fixed the mmap test in configure so that it won't fail on some platforms
that use munmap(char *, int) instead of munmap(void *, int).
\layout Itemize
Upgraded to the latest CVS version of libtool which fixed the problem with
using GNU Make under Solaris.
\layout Itemize
Added an option to copy files instead of using symbolic links for the special
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
dictionary files.
\layout Section*
Changes from .32.1 to .32.5 (August 18, 2000)
\layout Itemize
Changed my email from kevinatk at home com to kevina at users sourceforge
net please make a note of the new email address.
\layout Itemize
Added an option to control if the personal replacement dictionary is saved
when the save_all_wls method is called.
\layout Itemize
Brought back the ability to dump the master word list even in the case of
the special
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
lists.
\layout Itemize
Added a large number of hacker related words and some other slang terms
to the medium size word list.
\layout Itemize
Added an
\begin_inset Quotes eld
\end_inset
ispell
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
spell
\begin_inset Quotes erd
\end_inset
compatibility script for systems which don't have ispell installed.
They are located in the scripts/ directory and are not installed by default.
\layout Itemize
Manual fixes.
\layout Itemize
Added a note on not using GNU Make on Solaris.
\layout Section*
Changes from .32 to .32.1 (August 5, 2000)
\layout Itemize
Minor compile fixes for recent gcc snapshot.
\layout Itemize
Fixed naming of pwli files.
\layout Itemize
Fixed a bug where Aspell would crash when used with certain single letter
flags.
This bug was most noticeable when used with Emacs.
\layout Itemize
Word list changes, see SCOWL Readme.
\layout Itemize
Other miscellaneous changes.
\layout Section*
Changes from .31.1 to .32 (July 23, 2000)
\layout Itemize
Added support for optionally doing without the soundslike data.
\layout Itemize
Greatly reduced the amount of memory used when creating word lists.
\layout Itemize
Added support for ignoring accents when coming up with suggestions.
\layout Itemize
Added support for local-data-dir which is searched before data-dir.
\layout Itemize
Added support for specifying which words may be used in compounds and where
they may be used.
\layout Itemize
Added support for having more than one main word list as well as a special
\begin_inset Quotes eld
\end_inset
multi
\begin_inset Quotes erd
\end_inset
word list file which allows multiple word lists to be treated as one.
\layout Itemize
Aspell now uses a completely new word list.
\layout Itemize
The apostrophe (') is no longer considered part of the word when it is at
the end of the word such as in
\begin_inset Quotes eld
\end_inset
dogs'
\begin_inset Quotes erd
\end_inset
.
\layout Section*
Changes from .31 to .31.1 (June 18, 2000)
\layout Itemize
Fixed a bug where Aspell would not create a complete dictionary file on
some platforms when the data is 8-bit.
\layout Itemize
Added a workaround so Aspell will work with ispell.el 3.3.
\layout Itemize
Minor compile fixes so it would compile better with the very latest gcc
(CVS Version).
\layout Itemize
Removed note about compiling in Win32 as I was now able to get it to work.
\layout Section*
Changes from .30.1 to .31 (June 11, 2000)
\layout Itemize
Added support for spell checking run together words.
\layout Itemize
Added an option to produce a list of misspelled words from standard input.
\layout Itemize
More robust error reporting when reading in language data files.
\layout Itemize
Fixed a bug that would cause Aspell to crash if the
\begin_inset Quotes eld
\end_inset
special
\begin_inset Quotes erd
\end_inset
line was not defined in the language data file.
\layout Itemize
Updated the Pspell module.
\layout Itemize
Minor bug fixes.
\layout Itemize
Added cross references in
\begin_inset Quotes eld
\end_inset
The Aspell utility Chapter
\begin_inset Quotes erd
\end_inset
for easier use.
\layout Section*
Changes from .30 to .30.1 (April 29, 2000)
\layout Itemize
Ported Aspell to Win32 platforms.
\layout Itemize
Portability fixes which may help aspell compile on other platforms.
\layout Itemize
Aspell will no longer fail if for some reason the mmap fails, instead it
will just read the file in as normal and free the memory when done.
\layout Itemize
Minor changes in the format of the main word list as a result of these changes;
the old format should still work in most cases.
\layout Itemize
Fixed a bug where Aspell was ignoring the extension of file names such as
.html or .tex when checking files.
\layout Itemize
Fixed a bug where Aspell would go into an infinite loop when creating the
main word list from a word list which has duplicates in it.
\layout Itemize
Minor changes to the manual for better clarity.
\layout Section*
Changes from .29.1 to .30 (April 2, 2000)
\layout Itemize
Fixed many of the capitalization bugs found in previous versions of Aspell.
\layout Itemize
Changed the format of the main word list yet again.
\layout Itemize
Fixed a bug so that
\begin_inset Quotes eld
\end_inset
aspell check
\begin_inset Quotes erd
\end_inset
will work on the PowerPC.
\layout Itemize
Added ability to change configuration options in the middle of a session.
\layout Itemize
Added words from /usr/dict/words found on most Linux systems as well as
a bunch of commonly used abbreviations to the word list.
\layout Itemize
Fixed a bug where Aspell would dump core after reporting certain errors when
compiled with gcc 2.95 or higher.
This involved reworking the Exception class hierarchy to get around a bug in gcc
2.95.
\layout Itemize
Added a few more commands to the list of default commands the TeX filter
knows about.
\layout Itemize
Aspell will now check if a word only contains valid characters before adding
it to any dictionaries.
This might mean that you have to manually delete a few words from your
personal word list.
\layout Itemize
Added option to ignore case when checking a document.
\layout Itemize
Adjusted the parameters of the
\begin_inset Quotes eld
\end_inset
normal
\begin_inset Quotes erd
\end_inset
suggest mode so that significantly fewer far-fetched results are returned
in cases such as tomatoe, which went from 100 suggestions down to 32, at
the expense of getting slightly lower results (less than 1%).
\layout Itemize
Improved the edit distance algorithm for slightly faster results.
\layout Itemize
Removed the $$m command in pipe mode; you should now use
\begin_inset Quotes eld
\end_inset
$$cs mode,<>
\begin_inset Quotes erd
\end_inset
to set the mode and
\begin_inset Quotes eld
\end_inset
$$cr mode
\begin_inset Quotes erd
\end_inset
to find out the current mode.
\layout Itemize
Reworked parts of Aspell to use Pspell services to avoid duplicating code.
\layout Itemize
Added a module for the newly released Pspell.
It will get installed with the rest of aspell.
\layout Itemize
Miscellaneous other bug fixes.
\layout Section*
Changes from .29 to .29.1 (Feb 18, 2000)
\layout Itemize
Improved the TeX filter so that it will accept '@' at the beginning of a command
name and ignore trailing '*'s.
It also now has better defaults for which parameters to skip.
\layout Itemize
Reworked the main dictionary so that it can be memory mapped in.
This decreases startup time and allows multiple aspell processes to use
the same memory for the main word list.
This also made Aspell 64-bit clean, so it should now work on an
Alpha.
\layout Itemize
Fixed Aspell so that it can compile on platforms for which GNU as is not
available.
\layout Itemize
Fixed issue with flock so it would compile on FreeBSD.
\layout Itemize
Minor changes in the code to make it more C++ compliant, although I am sure
there will still be problems when using a compiler other than
gcc or egcs.
\layout Itemize
Added some comments to the header files to better document a few of the
classes.
\layout Section*
Changes from .28.3 to .29 (Feb 6, 2000)
\layout Itemize
Fixed a bug in the pipe mode with lines that start with
\begin_inset Quotes eld
\end_inset
^$$
\begin_inset Quotes erd
\end_inset
.
\layout Itemize
Added support for ignoring all words less than or equal to a specified length.
\layout Itemize
New soundslike code thanks to the contribution of Björn Jacke.
It now gets all of its data from a table making it easier for other people
to add soundslike code for their native language.
He also converted the metaphone algorithm to table form, eliminating the
need for the old metaphone code.
\layout Itemize
Major redesign of the suggestion code for better results.
\layout Itemize
Changed the format of the personal word lists.
In most cases it should be converted automatically.
\layout Itemize
Changed the format of the main word list.
\layout Itemize
Name space cleanup for more consistent naming.
I now use name spaces which means that gcc 2.8.* and egcs 1.0.* will no longer
cut it.
\layout Itemize
Used file locks when reading and saving the personal dictionaries so that
it is truly multiprocess safe.
\layout Itemize
Added rudimentary filter support.
\layout Itemize
Reworked the configuration system once again.
However, the changes to the end user who does not directly use my library
should be minimal.
\layout Itemize
Rewrote my code that handles parsing command line parameters so that it
no longer uses popt, as it was causing too many problems and didn't integrate
well with my new configuration system.
\layout Itemize
Fixed pipe mode so that it will properly ignore lines starting with '~'
for better ispell compatibility.
\layout Itemize
Aspell now has a new home page at
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/}
\end_inset
.
Please make note of the new URL.
\layout Itemize
Miscellaneous manual fixes and clarifications.
\layout Section*
Changes from .28.2.1 to .28.3 (Nov 20, 1999)
\layout Itemize
Fixed a bug that caused aspell to crash when spell checking words over 60
characters long.
\layout Itemize
Reworked
\begin_inset Quotes eld
\end_inset
aspell check
\begin_inset Quotes erd
\end_inset
so that
\begin_deeper
\layout Enumerate
You no longer have to hit enter when making a choice.
\layout Enumerate
It will now overwrite the original file instead of creating a new file.
An optional backup can be made by using the -b option.
\end_deeper
\layout Itemize
Fixed a few bugs in data.cc.
\layout Section*
Changes from .28.2 to .28.2.1 (Aug 25, 1999)
\layout Itemize
Fixed the version number for the shared library.
\layout Itemize
Fixed a problem with undefined references when linking to the shared library.
\layout Section*
Changes from .28.1 to .28.2 (Aug 25, 1999)
\layout Itemize
Fixed a bunch of bugs in the language and configuration classes.
\layout Itemize
Minor changes in the code so that it could compile with the new gcc 2.95(.1).
\layout Itemize
Changed the output of
\begin_inset Quotes eld
\end_inset
dump config
\begin_inset Quotes erd
\end_inset
so that default values are given the value "".
This means that the output can be used to create a configuration file.
\layout Itemize
Added notes on using aspell with VIM.
\layout Section*
Changes from .28 to .28.1 (July 27, 1999)
\layout Itemize
Removed some debug output.
\layout Itemize
Changed notes on compiling with gcc 2.8.* as I managed to get it to compile
on my school account.
\layout Itemize
Avoided including
\series bold
stdexcept
\series default
in
\series bold
const_string.hh
\series default
so that I could get it to compile on my school account with gcc 2.8.1.
\layout Section*
Changes from .27.2 to .28 (July 25, 1999)
\layout Itemize
Provided an iterator for the replacement classes.
\layout Itemize
Added support for dumping, creating, and merging the personal and replacement
word lists.
\layout Itemize
Changed the aspell utility command line a bit; it now uses popt.
\layout Itemize
Totally reworked Aspell's configuration system.
Now Aspell can get its configuration from any of 5 sources: the command line,
the environment variable ASPELL_CONF, the personal configuration file,
the global configuration file, and finally the compiled-in defaults.
\layout Itemize
Totally reworked the language class in preparation for my new language code.
See
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/international/}
\end_inset
for more information on what I have in store.
\layout Itemize
Added some options to the configure script: --enable-dict-dir=DIR, --enable-doc-
dir=DIR, --enable-debug, and --enable-opt
\layout Itemize
Removed some old header files.
\layout Itemize
Reorganized the directory structure a bit
\layout Itemize
Made the text version of the manual pages slightly easier to read
\layout Itemize
Used the
\backslash
url command for URLs for better formatting of the printed version.
\layout Section*
Changes from .27.1 to .27.2 (Mar 1, 1999)
\layout Itemize
Fixed a major bug that caused aspell to dump core when used without any
arguments
\layout Itemize
Fixed another major bug that caused aspell to do nothing when used in interactiv
e mode.
\layout Itemize
Added an option to exit in Aspell's interactive mode.
\layout Itemize
Removed some old documentation files from the distribution.
\layout Itemize
Minor changes to the section on using Aspell with egcs.
\layout Itemize
Minor changes to remove -Wall warnings.
\layout Section*
Changes from .27 to .27.1 (Feb 24, 1999)
\layout Itemize
Fixed a minor compile problem.
\layout Itemize
Updated the section on using Aspell with egcs to make it more clear why the
patch is necessary.
\layout Section*
Changes from .26.2 to .27 (Feb 22, 1999)
\layout Itemize
\series bold
Totally reworked the C++ library which means you may need to change some
things in your code.
\layout Itemize
Added support for detachable and multiple personal dictionaries in the C++
class library.
\layout Itemize
The C++ class library now throws exceptions.
\layout Itemize
Reworked Aspell's ability to learn from users' misspellings a bit so that it
now has a memory.
See section
\begin_inset LatexCommand \ref{replpair}
\end_inset
for more information.
\layout Itemize
Upgraded autoconf to version 2.13 and automake to version 1.4 for better portabili
ty.
\layout Itemize
Fixed the configuration so that
\begin_inset Quotes eld
\end_inset
make dist
\begin_inset Quotes erd
\end_inset
will work.
From now on Aspell will be distributed with
\begin_inset Quotes eld
\end_inset
make dist
\begin_inset Quotes erd
\end_inset
.
\layout Itemize
Added support to skip over URLs, email addresses and host names.
\layout Itemize
Added support for dumping the master and personal word list.
You can now also merge a personal word list.
Type aspell -help for help on using this feature.
\layout Itemize
Reorganized the source code.
\layout Itemize
Started using proper version numbers for the shared library.
\layout Itemize
Fixed a bug that caused Aspell to crash when adding certain replacement
pairs.
\layout Itemize
Fixed the problem with duplicate lines when exiting pipe mode for good.
\layout Section*
Changes from .26.1 to .26.2 (Jan 3, 1999)
\layout Itemize
Fixed another compile problem.
Hopefully this time it will really compile OK on other people's machines.
\layout Section*
Changes from .26 to .26.1 (Jan 3, 1999)
\layout Itemize
Fixed a small compile problem in
\series bold
as_data.cc
\series default
.
\layout Section*
Changes from .25.1 to .26 (Jan 3, 1999)
\layout Itemize
Fixed, for good, a bug that caused duplicate items to be displayed in the
suggestion list.
(If it still does, please send me email.)
\layout Itemize
Added the ability for Aspell to learn from the user's misspellings.
\layout Itemize
Library Interface changes.
Still more to come....
\layout Itemize
Aspell is now multiprocess safe.
When a personal dictionary (or replacement list) is saved it will now first
update the list against the dictionary on disk in case another process modified
it.
\layout Itemize
Fixed the bug that caused duplicate output when used non interactively in
pipe mode.
\layout Itemize
Dropped support for gcc 2.7.2 as the C++ compiler.
\layout Itemize
Updated the How Aspell Works (
\begin_inset LatexCommand \ref{works}
\end_inset
) Chapter.
\layout Itemize
Added support for the ASPELL_DATA_DIR environment variable.
\layout Section*
Changes from .25 to .25.1 (Dec 10, 1998)
\layout Itemize
Fixed the version number so that Aspell reports the correct version number.
\layout Itemize
Changed the note on gcc 2.7.2 compilers to make it clear that only the C++
compiler cannot be gcc 2.7.2; it is OK if the C compiler is gcc 2.7.2.
\layout Itemize
Updated the TODO list and reorganized it a bit.
\layout Itemize
Fixed the directory so that all the documentation will get installed in
${prefix}/doc/aspell instead of half of it in ${prefix}/doc/aspell and
half of it in ${prefix}/doc/kspell.
\layout Section*
Changes from .24 to .25 (Nov 23, 1998)
\layout Itemize
Total rework of how the main word list is stored.
Start up time decreased to about 1/3 of what it was in .24 and memory usage
decreased to about 2/3.
(When used with the provided word list on a Linux system).
\series bold
\series default
Also the
\series bold
format and default locations of the main word list data files changed
\series default
in the process and the data
\series bold
is now machine dependent
\series default
.
The personal word list format, however, stayed the same.
\layout Itemize
Changed the scoring method to produce slightly better results with words
like "the" vs. "teh" and other simple misspellings where two letters are swapped.
\layout Itemize
Fixed the very unpredictable behavior of the '*', '&', '@' commands in the
pipe mode.
\layout Itemize
Added documentation for Aspell's pipe mode (also known as ispell -a compatibility
mode).
\layout Itemize
Added a bunch of Aspell specific extensions to the pipe mode and documented
them.
\layout Itemize
Documented the
\series bold
to_soundslike
\series default
and
\series bold
soundslike
\series default
methods for the
\series bold
aspell
\series default
class.
\layout Itemize
Changed the scoring method to produce better results for words like
\begin_inset Quotes eld
\end_inset
fone
\begin_inset Quotes erd
\end_inset
vs
\begin_inset Quotes eld
\end_inset
phone
\begin_inset Quotes erd
\end_inset
and other words that have a spelling that doesn't directly relate to how
the word sounds by using the phoneme equivalent of the word in the scoring
of it.
\layout Itemize
Added the
\series bold
to_phoneme
\series default
and
\series bold
have_phoneme
\series default
methods to the
\series bold
SC_Language
\series default
class.
\layout Itemize
Added the
\series bold
to_phoneme
\series default
method to the
\series bold
aspell
\series default
class.
\layout Itemize
Added the framework for being able to learn from the user's misspellings.
Right now it just keeps a log of replacements.
\layout Itemize
Redid
\series bold
stl_rope-30.diff
\series default
.
\series bold
\series default
For some reason the version of patch on my system refused it.
\layout Itemize
Rewrite of the
\begin_inset Quotes eld
\end_inset
Using as a replacement for Ispell
\begin_inset Quotes erd
\end_inset
section and added the
\series bold
run-with-aspell
\series default
utility as a replacement of the old method of mapping Ispell to Aspell.
\layout Itemize
Fixed a bug that caused duplicate words to appear in the suggestion list.
\layout Section*
Changes from .23 to .24 (Nov 8, 1998)
\layout Itemize
Fixed my code so that it can once again compile with g++ 2.7.2.
\layout Itemize
Rewrote the How It Works chapter.
\layout Itemize
Rewrote the Requirement section and added notes on compiling with g++ 2.7.2.
\layout Itemize
Added a To Do chapter.
\layout Itemize
Added a Glossary and References chapter.
\layout Itemize
Other minor documentation improvements.
\layout Itemize
Internal code documentation improvements.
\layout Section*
Changes from .22.1 to .23 (Oct 31, 1998)
\layout Itemize
Minor documentation fixes.
\layout Itemize
Changed the scoring strategy for words with 3 or fewer letters.
This cut the number of words returned for these roughly in half.
\layout Itemize
Expanded the word list to also include
\series bold
american.0
\series default
and
\series bold
american.1
\series default
from the Ispell distribution.
It now includes
\series bold
english.0
\series default
,
\series bold
english.1
\series default
,
\series bold
american.0
\series default
and
\series bold
american.1
\series default
from the directory
\series bold
languages/english
\series default
provided with Ispell 3.1.20.
\layout Itemize
Added a link to the location of the latest Ispell.el in the documentation.
\layout Itemize
Started a C interface and added some rough documentation for it.
\layout Section*
Changes from .22 to .22.1 (Oct 27, 1998)
\layout Itemize
Minor bug fixes.
I was deleting arrays with delete rather than delete[].
I was surprised that this had not created a problem.
\layout Itemize
Added a simple test program to test for a memory leak present on some systems.
(Only debian slink at the moment.) See the file memleak-test.cc for more
info.
\layout Section*
Changes from .21 to .22 (Oct 26, 1998)
\layout Itemize
Major redesign of the scoring method.
It now uses absolute distances rather than relative scores for more consistent
results.
See suggest.cc for more info.
\layout Itemize
Suggest code rewritten in several places; however, the core process stayed
the same.
\layout Itemize
The suggest_ultra method temporarily does nothing.
It should be working again by the next release.
\layout Section*
Changes from .20 to .21 (Oct 13, 1998)
\layout Itemize
Added documentation for aspell::Error
\layout Itemize
\series bold
Changed the library name from libspell to libaspell.
\series default
It should never have been libspell in the first place.
Sorry for the incompatibility.
\layout Itemize
Added
\series bold
as_error.hh
\series default
to the list of files copied to the include directory so that you can actually
use the library outside of the source dir.
\layout Itemize
Fixed a bug that caused a segmentation fault with words where the only suggestion
was inserting a space or hyphen such as in
\begin_inset Quotes eld
\end_inset
ledgerline
\begin_inset Quotes erd
\end_inset
.
\layout Itemize
Added the
\series bold
score
\series default
method to
\series bold
aspell
\series default
.
\layout Itemize
Changed the scoring method to deal a lot better with words where the user uses
"f" in place of "ph".
\layout Section*
Changes from .11 to .20 (Oct 10, 1998)
\layout Itemize
\series bold
Name change.
\emph on
\emph default
Everything that was Kspell is now Aspell.
\series default
Sorry, the name Kspell was already used by KDE and I didn't want to cause
any confusion.
\layout Itemize
Fixed a bug that caused a segmentation fault when the HOME environment
variable doesn't exist.
\layout Section*
Changes from .10 to .11 (Sep 12, 1998)
\layout Itemize
Overhaul of the SC_Language class
\layout Itemize
Added documentation for international support
\layout Itemize
Added documentation for the C++ library
\layout Itemize
Other minor bug fixes.
\layout Chapter
To Do
\layout Standard
Words in bold indicate how you should refer to the item when discussing
it with me or others.
\layout Section
Things that will be done real soon
\layout Standard
These items should get done within a release or two.
\layout Itemize
Totally rewrite the aspell international support.
See
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/international/}
\end_inset
for more information.
\layout Itemize
Rework the aspell
\series bold
check
\series default
function to provide support for using any number of filters which will
be needed for international support.
\layout Itemize
Add support for more intelligently coming up with suggestions for words
that are run-togethers.
\layout Section
Things that need to be done
\layout Standard
These items will eventually be implemented, as I know they are important;
however, I am not sure when they will get done.
\layout Itemize
Figure out a way for Aspell to work better with
\series bold
extremely large dictionaries
\series default
.
\layout Section
Things that I would like to get done
\layout Standard
These items will eventually be implemented.
I hope to have them all done before I move aspell to beta testing.
They are in the approximate order of when they will get done.
\layout Section
Things that will be done eventually
\layout Standard
I plan on doing these things eventually.
It is just a matter of getting around to it.
\layout Section
Good ideas that are worth implementing
\layout Standard
These items all sound like good ideas; however, I am not sure when I will
get to implementing them, if ever.
If you are looking for a way to contribute, picking up on one of these ideas
would be a great way to start.
They are presented in no particular order.
\layout Itemize
Use Lawrence Philips' new Double Metaphone algorithm.
See
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/metaphone/}
\end_inset
.
\layout Itemize
Add support for
\series bold
affix compression.
\layout Itemize
Come up with a plug-in for
\series bold
gEdit
\series default
, the GNOME text editor.
\layout Itemize
Change languages (and thus dictionaries) based on the information in the
actual document.
\layout Itemize
Come up with a
\series bold
nroff mode
\series default
for spell checking.
\layout Itemize
Come up with a mode that will skip words based on the symbols that (almost)
always surround the word.
(
\series bold
Word skipping by context
\series default
)
\layout Itemize
Create two
\series bold
server modes
\series default
for Aspell.
One that uses the
\series bold
DICT
\series default
protocol and one that uses the
\series bold
ispell -a
\series default
method of communication on some arbitrary port.
\layout Itemize
Come up with
\series bold
thread safe personal dictionaries
\series default
.
\layout Itemize
Use a
\series bold
Hidden Markov Model
\series default
to base the suggestions on not only the word itself but on the context around
the word.
\layout Itemize
Have a way to
\series bold
email
\series default
\series bold
the personal dictionary
\series default
and/or replacement list to a particular address either periodically or when
it grows to a certain size.
\layout Itemize
Be able to
\series bold
\series default
accept
\series bold
words with spaces in them
\series default
as many languages have words, such as a word in a foreign phrase, which
only make sense when followed by other words.
\layout Standard
The following good ideas were found in the ispell WISHES file so I thought
I would pass them on.
\layout Itemize
Ispell should be smart enough to ignore hyphenation signs, such as the TeX
\backslash
- hyphenation indicator.
\layout Itemize
(Jeff Edmonds) The personal dictionary should be able to remove certain
words from the master dictionary, so that obscure words like "wether" wouldn't
mask favorite typos.
\layout Itemize
(Jeff Edmonds) It would be wonderful if ispell could correct inserted spaces
such as "th e" for "the" or even "can not" for "cannot".
\layout Itemize
Since ispell has dictionaries available to it, it is conceivable that it
could automatically determine the language of a particular file by choosing
the dictionary that produced the fewest spelling errors on the first few
lines.
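\layout Standard
As a rough illustration of the last wish above, the following Python sketch picks the dictionary that flags the fewest words in the first few lines of a text.
It is only a sketch under stated assumptions: the word lists, the function names, and the max_lines parameter are all hypothetical and are not part of Ispell or Aspell.

```python
import string

# Hypothetical toy word lists; real dictionaries would be loaded from disk.
DICTIONARIES = {
    "english": {"the", "cat", "is", "on", "table", "and", "dog"},
    "german": {"die", "katze", "ist", "auf", "dem", "tisch", "und", "der", "hund"},
}


def tokenize(text):
    """Lowercase the text and strip punctuation from each whitespace-separated token."""
    words = []
    for token in text.lower().split():
        word = token.strip(string.punctuation)
        if word:
            words.append(word)
    return words


def count_errors(words, dictionary):
    """Count the words that are missing from the given dictionary."""
    return sum(1 for word in words if word not in dictionary)


def guess_language(text, dictionaries=DICTIONARIES, max_lines=5):
    """Return the name of the dictionary that flags the fewest words
    in the first max_lines lines of text (the first name wins ties)."""
    head = " ".join(text.splitlines()[:max_lines])
    words = tokenize(head)
    return min(dictionaries, key=lambda name: count_errors(words, dictionaries[name]))
```

\layout Standard
With these toy lists, guess_language("The cat is on the table.") selects the English list.
A real implementation would probably also want a threshold for "no dictionary fits well enough" so that it can fall back to the user's default language.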
\layout Section
Things that are not likely to get implemented
\layout Standard
These ideas are not likely to get implemented any time soon.
\layout Itemize
(None Yet)
\layout Section
Notes and Status of various items
\layout Subsection
Affix Compression
\begin_inset LatexCommand \label{affixcomp}
\end_inset
\layout Standard
Due to the current way my spell checker works, implementing affix compression
would be next to impossible.
Nevertheless, I do realize that for some languages affix compression is
very important.
\layout Standard
So to solve this dilemma I plan on having two different modes of my spell
checker: One with affix compression that does not use soundslike pairs
(much like ispell) and one without affix compression that does use soundslike.
\layout Standard
I plan to extract the affix manipulation code from Ispell with the help
of an Ispell author.
The tricky part would be getting this all to work properly
at run time based on the dictionary used.
\layout Subsection
Extremely Large Dictionaries
\layout Standard
This problem stems from the way words are indexed in Aspell.
It will get resolved when I implement the affix compression mode,
as only one index would be used.
\layout Subsection
General region skipping
\layout Standard
I want to implement this to give other people an idea of how it should be done
and because I am really sick of having to spell check through URLs and email
addresses.
\layout Subsection
Word skipping by context
\layout Standard
This was posted on the Aspell mailing list on January 1, 1999:
\layout Standard
I had an idea on a great general way to determine if a word should be skipped.
Determine the words to skip based on the symbols that (almost) always surround
the word.
\layout Standard
For example when asked to check the following C++ code:
\layout LyX-Code
cout <
\latex latex
\backslash
/
\latex default
< "My age is: " <
\latex latex
\backslash
/
\latex default
< num <
\latex latex
\backslash
/
\latex default
< endl;
\newline
cout <
\latex latex
\backslash
/
\latex default
< "Next year I will be " <
\latex latex
\backslash
/
\latex default
< num + 1 <
\latex latex
\backslash
/
\latex default
< endl;
\layout Standard
cout, num, and endl will all be skipped.
"cout" will be skipped because it is always followed by a <
\latex latex
\backslash
/
\latex default
<.
"num" will be skipped because it is always preceded by a <
\latex latex
\backslash
/
\latex default
<.
And "endl" will be skipped because it is always between a <
\latex latex
\backslash
/
\latex default
< and a ;.
\layout Standard
Given the following HTML code:
\layout LyX-Code
\newline
One Two Three
\newline
1 2 3
\newline
\newline
\newline
\layout Standard
table, width, cellspacing, cellpadding, tr, and td will all be skipped because
they are always enclosed in "<>".
Now of course table and width would be marked as correct anyway; however,
there is no harm in skipping them.
\layout Standard
So I was wondering if anyone on this list has any experience in writing
this sort of context recognition code or could give me some pointers in
the right direction.
\layout Standard
This sort of word skipping will be very powerful if done right.
I imagine that it could replace specific spell checker modes for TeX, Nroff,
SGML, etc., because it will automatically be able to figure out where it should
skip words.
It could also probably do a very good job on programming language code.
\layout Standard
If you are interested in helping me out with this or just have general comments
about the idea please let me know.
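\layout Standard
The idea in this subsection can be sketched in a few lines of Python.
This is a deliberately crude illustration, not Aspell code: a word is skipped when it occurs at least min_count times and every occurrence has the same symbol run directly before it or directly after it (ignoring whitespace).
The function names and the min_count threshold are assumptions made for the example.

```python
from collections import defaultdict

SAMPLE = '''cout << "My age is: " << num << endl;
cout << "Next year I will be " << num + 1 << endl;'''


def tokenize(text):
    """Split text into ("word", s) and ("sym", s) tokens, dropping whitespace."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        is_word = ch.isalnum() or ch == "_"
        j = i
        # Extend the run while characters stay in the same class.
        while j < len(text) and not text[j].isspace() and (
            (text[j].isalnum() or text[j] == "_") == is_word
        ):
            j += 1
        tokens.append(("word" if is_word else "sym", text[i:j]))
        i = j
    return tokens


def words_to_skip(text, min_count=2):
    """Return the words whose left or right neighbouring symbol never varies."""
    tokens = tokenize(text)
    left = defaultdict(set)
    right = defaultdict(set)
    count = defaultdict(int)
    for k, (kind, value) in enumerate(tokens):
        if kind != "word":
            continue
        prev_is_sym = k > 0 and tokens[k - 1][0] == "sym"
        next_is_sym = k + 1 < len(tokens) and tokens[k + 1][0] == "sym"
        left[value].add(tokens[k - 1][1] if prev_is_sym else "")
        right[value].add(tokens[k + 1][1] if next_is_sym else "")
        count[value] += 1
    skip = set()
    for word, n in count.items():
        if n < min_count:
            continue  # too few occurrences to say anything about context
        same_left = len(left[word]) == 1 and left[word] != {""}
        same_right = len(right[word]) == 1 and right[word] != {""}
        if same_left or same_right:
            skip.add(word)
    return skip
```

\layout Standard
On the C++ sample above, words_to_skip(SAMPLE) selects cout, num, and endl, while the words inside the string literals occur only once each and are left for the spell checker.
A usable version would need the "(almost) always" tolerance described in the post and a looser notion of enclosure to handle markup such as HTML tags.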
\layout Subsection
Hidden Markov Model
\layout Standard
Knud Haugaard Sørensen suggested this one.
From his email on the Aspell mailing list:
\layout Quote
consider this examples.
\layout Quote
a fone number.
-> a phone number.
\newline
a fone dress.
-> a fine dress.
\layout Quote
the example illustrates that the right correction might depend on the context
of the word.
So I suggest that you take a look on HMM to solve this problem.
\layout Quote
This might also provide a good base to include grammar correction in aspell.
\layout Quote
see this link
\begin_inset LatexCommand \url{http://www.cse.ogi.edu/CSLU/HLTsurvey/ch1node7.html}
\end_inset
\layout Standard
I think it is a great idea.
Unfortunately, it will probably be very complicated to implement.
Perhaps in the far future.
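\layout Standard
To make the "fone" example concrete, here is a minimal Python sketch that ranks candidate corrections by how often each one was seen directly before the following word.
It is not a Hidden Markov Model, only a much simpler bigram count standing in for one, and the corpus, function names, and candidate lists are hypothetical.

```python
from collections import Counter

# Hypothetical training text; a real model would be estimated from a large corpus.
CORPUS = "a phone number . a fine dress . the phone number . a fine day ."


def bigram_counts(corpus):
    """Count adjacent word pairs in a whitespace-separated corpus."""
    words = corpus.split()
    return Counter(zip(words, words[1:]))


def pick_by_context(candidates, next_word, counts):
    """Return the candidate seen most often directly before next_word
    (the first candidate wins ties)."""
    return max(candidates, key=lambda cand: counts[(cand, next_word)])
```

\layout Standard
With this toy corpus, the candidates "phone" and "fine" for the misspelling "fone" are resolved to "phone" before "number" and to "fine" before "dress".
A proper HMM would additionally model hidden states and smooth the probabilities for unseen word pairs.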
\layout Subsection
Email the Personal Dictionary
\layout Standard
Someone suggested in a personal email:
\layout Quote
Have you thought of adding a function to aspell, that - when the personal
dictionary has grown significantly - sends the user's personal dictionary
to the maintainer of the corresponding aspell dictionary? (if the user
allows it)
\layout Quote
It would be a very useful service to the dictionary maintainers, and I think
most users can see their benefit in it too.
\layout Standard
And I replied:
\layout Quote
Yes I have considered something like that but not for the personal dictionaries
but rather the replacement word list in order to get better test data for
\begin_inset LatexCommand \url{http://aspell.sourceforge.net/test/}
\end_inset
.
The problem is I don't know of a good way to do this since Aspell can also
be used as a library.
It also is not a real high priority, especially since I would first need
to learn how to send email within a C++ program.
\layout Subsection
Words With Spaces in Them
\layout Standard
While this is something I would like to do it is not a simple task.
The basic problem is that when tokenizing a string there is no good way
to keep phrases together.
So the solution is to somehow add special conditions to certain words
which will dictate which words can come before or after them.
Then there is also a problem of how to come up with intelligent suggestions.
What further complicates things is that many applications send words to
Aspell a word at a time.
So even if Aspell did support such a thing many applications that would
use Aspell will not.
So, in order for this to work applications will need to send text to Aspell
a document or at least a sentence at a time.
Unfortunately the framework for doing this is not there yet.
It will be once I finish the filter interface.
Another possibility is to provide callback functions through which Aspell will
be able to request the previous or next word.
Yet again the framework for doing this is not there.
Perhaps sometime in the near future.
\layout Chapter
Support for Gcc 2.7.2
\layout Standard
(and other non-standards-compliant compilers)
\layout Standard
My original plan was to program in such a way that Aspell would
compile under Gcc 2.7.2; however, after discovering a rather nasty bug in 2.7.2
with nested types I have decided to drop all support for Gcc 2.7.2.
As of Aspell .27 all hope for being able to compile under Gcc 2.7.2 is lost
as I am now using many modern C++ features which are simply not present
in Gcc 2.7.2, most notably template specialization and template members.
Egcs 1.1.1 is a very good standards compliant compiler and that is now the
officially supported compiler.
However as Egcs 1.1.1 is relatively new and, except for namespaces, provides
little new functionality I will continue to support Egcs 1.0.3.
Gcc 2.8.1 should in theory work; however, it is so buggy that I have yet to get
Aspell to compile with it, nor has anyone else that I know of.
\layout Standard
Yes, my code could be rewritten so that it could compile under Gcc 2.7.2, but
why? Using modern C++ has probably accelerated the development of this
library by at least 50%.
And for that matter, why stop at Gcc 2.7.2? Why not go all out and totally
rewrite my code in pure C?
I hope you see my point.
\layout Standard
However that does not mean I want to sacrifice portability unnecessarily.
If you see any part of my code that is not Standard C++ please let me
know.
My hope is that my code could compile on all standards-compliant C++ compilers
with the addition of a few extra header files from SGI's STL.
\layout Standard
As a side note I think that Mozilla's C++ portability guide (
\begin_inset LatexCommand \url{http://www.mozilla.org/docs/tplist/catBuild/portable-cpp.html}
\end_inset
) could be summed up in one sentence: Program in the dark ages of C++.
\layout Chapter
Credits
\layout Itemize
To the many authors of Ispell (including R.
E.
Gorin, Pace Willisson, and Geoff Kuenning) for providing me with a good
word list as well as giving me a few good ideas.
\layout Itemize
Alan Beale for going well beyond the call of duty in helping me create
a better word list.
\layout Itemize
Lawrence Philips for coming up with the original metaphone algorithm and
Michael Kuhn for writing C code for the algorithm.
\layout Itemize
Björn Jacke for coming up with a generic soundslike algorithm which gets
all of its data from a file, thus eliminating almost all need for language
specific code from Aspell.
\layout Itemize
To the authors of SGI STL version 3.0 and up for providing a great set of
generic container classes which cut the development time of this program
in half at least.
\layout Itemize
To the LyX development team for giving me a great tool for the development
of this manual.
\layout Chapter
Glossary and References
\layout Description
affix In grammar, a word element that, when added to a word, modifies its
meaning or function; prefix, infix, or suffix.
\layout Description
Debian A 100% Open Source Linux distribution
\begin_inset LatexCommand \url{http://www.debian.org}
\end_inset
\layout Description
DICT\SpecialChar ~
protocol A TCP transaction based query/response protocol that allows
a client to access dictionary definitions from a set of natural language
dictionary databases.
\begin_inset LatexCommand \url{http://www.dict.org/}
\end_inset
\layout Description
Gnome A project to build a complete, user-friendly desktop based entirely
on free software.
\begin_inset LatexCommand \url{http://www.gnome.org/}
\end_inset
\layout Description
GTK+ A library for creating graphical user interfaces for the X Window System.
It is designed to be small, efficient, and flexible.
\begin_inset LatexCommand \url{http://www.gtk.org/}
\end_inset
\layout Description
GUI Graphical User Interface
\layout Description
Ispell.el Emacs interface for Ispell.
\begin_inset LatexCommand \url{http://www.kdstevens.com/~stevens/ispell-page.html}
\end_inset
\layout Description
Ispell An international spell checker which is just about the only decent
Open Source spell checker out there (except Aspell, of course).
\begin_inset LatexCommand \url{http://fmg-www.cs.ucla.edu/geoff/ispell.html}
\end_inset
\layout Description
Ispell\SpecialChar ~
-a An Ispell mode that is designed to be used by other applications
through a pipe.
\layout Description
KDE A powerful graphical desktop environment for Unix workstations.
\begin_inset LatexCommand \url{http://www.kde.org}
\end_inset
\layout Description
Linux An Open Source version of Unix which runs on many platforms.
\begin_inset LatexCommand \url{http://www.linux.org/}
\end_inset
\layout Description
LyX A What You See Is What You Mean document editor for the X environment.
\begin_inset LatexCommand \url{http://www.lyx.org}
\end_inset
\layout Description
Open\SpecialChar ~
Source Software where the source code is available for anyone to extend
or modify.
\begin_inset LatexCommand \url{http://www.opensource.org/}
\end_inset
\layout Description
Red\SpecialChar ~
Hat A commercial Linux distribution.
\begin_inset LatexCommand \url{http://www.redhat.com}
\end_inset
\layout Description
RPM Red Hat's packaging format, also used by other Linux distributions.
\layout Description
STL Standard Template Library.
A C++ library of container classes, algorithms, and iterators.
\layout Description
SGI\SpecialChar ~
STL Silicon Graphics' implementation of the STL.
\begin_inset LatexCommand \url{http://www.sgi.com/Technology/STL/}
\end_inset
\layout Description
STLPort A port of SGI STL designed to run on compilers that don't support
all the latest features of C++.
\begin_inset LatexCommand \url{http://corp.metabyte.com/~fbp/stl/}
\end_inset
\layout Chapter
Copyright
\layout Standard
The library and utility program are Copyright 2000 by Kevin Atkinson and
are licensed under the LGPL.
Certain parts of the library, as indicated at the top of the source file,
are under a weaker license.
The provided word list is extracted from the word lists provided with Ispell,
thus it is distributed under the same terms as Ispell.
\layout Standard
The two licenses follow:
\layout Section
LGPL
\layout Standard
\align center
GNU LIBRARY GENERAL PUBLIC LICENSE
\newline
Version 2, June 1991
\layout Standard
Copyright (C) 1991 Free Software Foundation, Inc.
\layout Quote
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
\layout Standard
Everyone is permitted to copy and distribute verbatim copies of this license
document, but changing it is not allowed.
\layout Standard
[This is the first released version of the library GPL.
It is numbered 2 because it goes with version 2 of the ordinary GPL.]
\layout Standard
\align center
Preamble
\layout Standard
The licenses for most software are designed to take away your freedom to
share and change it.
By contrast, the GNU General Public Licenses are intended to guarantee
your freedom to share and change free software --- to make sure the software
is free for all its users.
\layout Standard
This license, the Library General Public License, applies to some specially
designated Free Software Foundation software, and to any other libraries
whose authors decide to use it.
You can use it for your libraries, too.
\layout Standard
When we speak of free software, we are referring to freedom, not price.
Our General Public Licenses are designed to make sure that you have the
freedom to distribute copies of free software (and charge for this service
if you wish), that you receive source code or can get it if you want it,
that you can change the software or use pieces of it in new free programs;
and that you know you can do these things.
\layout Standard
To protect your rights, we need to make restrictions that forbid anyone
to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the library, or if you modify it.
\layout Standard
For example, if you distribute copies of the library, whether gratis or
for a fee, you must give the recipients all the rights that we gave you.
You must make sure that they, too, receive or can get the source code.
If you link a program with the library, you must provide complete object
files to the recipients so that they can relink them with the library,
after making changes to the library and recompiling it.
And you must show them these terms so they know their rights.
\layout Standard
Our method of protecting your rights has two steps: (1) copyright the library,
and (2) offer you this license which gives you legal permission to copy,
distribute and/or modify the library.
\layout Standard
Also, for each distributor's protection, we want to make certain that everyone
understands that there is no warranty for this free library.
If the library is modified by someone else and passed on, we want its recipient
s to know that what they have is not the original version, so that any problems
introduced by others will not reflect on the original authors' reputations.
\layout Standard
Finally, any free program is threatened constantly by software patents.
We wish to avoid the danger that companies distributing free software will
individually obtain patent licenses, thus in effect transforming the program
into proprietary software.
To prevent this, we have made it clear that any patent must be licensed
for everyone's free use or not licensed at all.
\layout Standard
Most GNU software, including some libraries, is covered by the ordinary
GNU General Public License, which was designed for utility programs.
This license, the GNU Library General Public License, applies to certain
designated libraries.
This license is quite different from the ordinary one; be sure to read
it in full, and don't assume that anything in it is the same as in the
ordinary license.
\layout Standard
The reason we have a separate public license for some libraries is that
they blur the distinction we usually make between modifying or adding to
a program and simply using it.
Linking a program with a library, without changing the library, is in some
sense simply using the library, and is analogous to running a utility program
or application program.
However, in a textual and legal sense, the linked executable is a combined
work, a derivative of the original library, and the ordinary General Public
License treats it as such.
\layout Standard
Because of this blurred distinction, using the ordinary General Public License
for libraries did not effectively promote software sharing, because most
developers did not use the libraries.
We concluded that weaker conditions might promote sharing better.
\layout Standard
However, unrestricted linking of non-free programs would deprive the users
of those programs of all benefit from the free status of the libraries
themselves.
This Library General Public License is intended to permit developers of
non-free programs to use free libraries, while preserving your freedom
as a user of such programs to change the free libraries that are incorporated
in them.
(We have not seen how to achieve this as regards changes in header files,
but we have achieved it as regards changes in the actual functions of the
Library.) The hope is that this will lead to faster development of free
libraries.
\layout Standard
The precise terms and conditions for copying, distribution and modification
follow.
Pay close attention to the difference between a "work based on the library"
and a "work that uses the library".
The former contains code derived from the library, while the latter only
works together with the library.
\layout Standard
Note that it is possible for a library to be covered by the ordinary General
Public License rather than by this special one.
\layout Standard
\align center
GNU LIBRARY GENERAL PUBLIC LICENSE
\newline
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
\layout Standard
0.
This License Agreement applies to any software library which contains a
notice placed by the copyright holder or other authorized party saying
it may be distributed under the terms of this Library General Public License
(also called "this License").
Each licensee is addressed as "you".
\layout Standard
A "library" means a collection of software functions and/or data prepared
so as to be conveniently linked with application programs (which use some
of those functions and data) to form executables.
\layout Standard
The "Library", below, refers to any such software library or work which
has been distributed under these terms.
A "work based on the Library" means either the Library or any derivative
work under copyright law: that is to say, a work containing the Library
or a portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language.
(Hereinafter, translation is included without limitation in the term "modificat
ion".)
\layout Standard
"Source code" for a work means the preferred form of the work for making
modifications to it.
For a library, complete source code means all the source code for all modules
it contains, plus any associated interface definition files, plus the scripts
used to control compilation and installation of the library.
\layout Standard
Activities other than copying, distribution and modification are not covered
by this License; they are outside its scope.
The act of running a program using the Library is not restricted, and output
from such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for writing
it).
Whether that is true depends on what the Library does and what the program
that uses the Library does.
\layout Standard
1.
You may copy and distribute verbatim copies of the Library's complete source
code as you receive it, in any medium, provided that you conspicuously
and appropriately publish on each copy an appropriate copyright notice
and disclaimer of warranty; keep intact all the notices that refer to this
License and to the absence of any warranty; and distribute a copy of this
License along with the Library.
\layout Standard
You may charge a fee for the physical act of transferring a copy, and you
may at your option offer warranty protection in exchange for a fee.
\layout Standard
2.
You may modify your copy or copies of the Library or any portion of it,
thus forming a work based on the Library, and copy and distribute such
modifications or work under the terms of Section 1 above, provided that
you also meet all of these conditions:
\layout Standard
\pextra_type 1 \pextra_width 0.5in
a) The modified work must itself be a software library.
\layout Standard
\pextra_type 1 \pextra_width 0.5in
b) You must cause the files modified to carry prominent notices stating
that you changed the files and the date of any change.
\layout Standard
\pextra_type 1 \pextra_width 0.5in
c) You must cause the whole of the work to be licensed at no charge to all
third parties under the terms of this License.
\layout Standard
\pextra_type 1 \pextra_width 0.5in
d) If a facility in the modified Library refers to a function or a table
of data to be supplied by an application program that uses the facility,
other than as an argument passed when the facility is invoked, then you
must make a good faith effort to ensure that, in the event an application
does not supply such function or table, the facility still operates, and
performs whatever part of its purpose remains meaningful.
\layout Standard
(For example, a function in a library to compute square roots has a purpose
that is entirely well-defined independent of the application.
Therefore, Subsection 2d requires that any application-supplied function
or table used by this function must be optional: if the application does
not supply it, the square root function must still compute square roots.)
\layout Standard
These requirements apply to the modified work as a whole.
If identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in themselves,
then this License, and its terms, do not apply to those sections when you
distribute them as separate works.
But when you distribute the same sections as part of a whole which is a
work based on the Library, the distribution of the whole must be on the
terms of this License, whose permissions for other licensees extend to
the entire whole, and thus to each and every part regardless of who wrote
it.
\layout Standard
Thus, it is not the intent of this section to claim rights or contest your
rights to work written entirely by you; rather, the intent is to exercise
the right to control the distribution of derivative or collective works
based on the Library.
\layout Standard
In addition, mere aggregation of another work not based on the Library with
the Library (or with a work based on the Library) on a volume of a storage
or distribution medium does not bring the other work under the scope of
this License.
\layout Standard
3.
You may opt to apply the terms of the ordinary GNU General Public License
instead of this License to a given copy of the Library.
To do this, you must alter all the notices that refer to this License,
so that they refer to the ordinary GNU General Public License, version
2, instead of to this License.
(If a newer version than version 2 of the ordinary GNU General Public License
has appeared, then you can specify that version instead if you wish.) Do
not make any other change in these notices.
\layout Standard
Once this change is made in a given copy, it is irreversible for that copy,
so the ordinary GNU General Public License applies to all subsequent copies
and derivative works made from that copy.
\layout Standard
This option is useful when you wish to copy part of the code of the Library
into a program that is not a library.
\layout Standard
4.
You may copy and distribute the Library (or a portion or derivative of
it, under Section 2) in object code or executable form under the terms
of Sections 1 and 2 above provided that you accompany it with the complete
corresponding machine-readable source code, which must be distributed under
the terms of Sections 1 and 2 above on a medium customarily used for software
interchange.
\layout Standard
If distribution of object code is made by offering access to copy from a
designated place, then offering equivalent access to copy the source code
from the same place satisfies the requirement to distribute the source
code, even though third parties are not compelled to copy the source along
with the object code.
\layout Standard
5.
A program that contains no derivative of any portion of the Library, but
is designed to work with the Library by being compiled or linked with it,
is called a "work that uses the Library".
Such a work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.
\layout Standard
However, linking a "work that uses the Library" with the Library creates
an executable that is a derivative of the Library (because it contains
portions of the Library), rather than a "work that uses the library".
The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.
\layout Standard
When a "work that uses the Library" uses material from a header file that
is part of the Library, the object code for the work may be a derivative
work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be linked
without the Library, or if the work is itself a library.
The threshold for this to be true is not precisely defined by law.
\layout Standard
If such an object file uses only numerical parameters, data structure layouts
and accessors, and small macros and small inline functions (ten lines or
less in length), then the use of the object file is unrestricted, regardless
of whether it is legally a derivative work.
(Executables containing this object code plus portions of the Library will
still fall under Section 6.)
\layout Standard
Otherwise, if the work is a derivative of the Library, you may distribute
the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6, whether
or not they are linked directly with the Library itself.
\layout Standard
6.
As an exception to the Sections above, you may also compile or link a "work
that uses the Library" with the Library to produce a work containing portions
of the Library, and distribute that work under terms of your choice, provided
that the terms permit modification of the work for the customer's own use
and reverse engineering for debugging such modifications.
\layout Standard
You must give prominent notice with each copy of the work that the Library
is used in it and that the Library and its use are covered by this License.
You must supply a copy of this License.
If the work during execution displays copyright notices, you must include
the copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License.
Also, you must do one of these things:
\layout Standard
\pextra_type 1 \pextra_width 0.5in
a) Accompany the work with the complete corresponding machine-readable source
code for the Library including whatever changes were used in the work (which
must be distributed under Sections 1 and 2 above); and, if the work is
an executable linked with the Library, with the complete machine-readable
"work that uses the Library", as object code and/or source code, so that
the user can modify the Library and then relink to produce a modified executabl
e containing the modified Library.
(It is understood that the user who changes the contents of definitions
files in the Library will not necessarily be able to recompile the application
to use the modified definitions.)
\layout Standard
\pextra_type 1 \pextra_width 0.5in
b) Accompany the work with a written offer, valid for at least three years,
to give the same user the materials specified in Subsection 6a, above,
for a charge no more than the cost of performing this distribution.
\layout Standard
\pextra_type 1 \pextra_width 0.5in
c) If distribution of the work is made by offering access to copy from a
designated place, offer equivalent access to copy the above specified materials
from the same place.
\layout Standard
\pextra_type 1 \pextra_width 0.5in
d) Verify that the user has already received a copy of these materials or
that you have already sent this user a copy.
\layout Standard
For an executable, the required form of the "work that uses the Library"
must include any data and utility programs needed for reproducing the executabl
e from it.
However, as a special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary form)
with the major components (compiler, kernel, and so on) of the operating
system on which the executable runs, unless that component itself accompanies
the executable.
\layout Standard
It may happen that this requirement contradicts the license restrictions
of other proprietary libraries that do not normally accompany the operating
system.
Such a contradiction means you cannot use both them and the Library together
in an executable that you distribute.
\layout Standard
7.
You may place library facilities that are a work based on the Library side-by-s
ide in a single library together with other library facilities not covered
by this License, and distribute such a combined library, provided that
the separate distribution of the work based on the Library and of the other
library facilities is otherwise permitted, and provided that you do these
two things:
\layout Standard
\pextra_type 1 \pextra_width 0.5in
a) Accompany the combined library with a copy of the same work based on
the Library, uncombined with any other library facilities.
This must be distributed under the terms of the Sections above.
\layout Standard
\pextra_type 1 \pextra_width 0.5in
b) Give prominent notice with the combined library of the fact that part
of it is a work based on the Library, and explaining where to find the
accompanying uncombined form of the same work.
\layout Standard
8.
You may not copy, modify, sublicense, link with, or distribute the Library
except as expressly provided under this License.
Any attempt otherwise to copy, modify, sublicense, link with, or distribute
the Library is void, and will automatically terminate your rights under
this License.
However, parties who have received copies, or rights, from you under this
License will not have their licenses terminated so long as such parties
remain in full compliance.
\layout Standard
9.
You are not required to accept this License, since you have not signed
it.
However, nothing else grants you permission to modify or distribute the
Library or its derivative works.
These actions are prohibited by law if you do not accept this License.
Therefore, by modifying or distributing the Library (or any work based
on the Library), you indicate your acceptance of this License to do so,
and all its terms and conditions for copying, distributing or modifying
the Library or works based on it.
\layout Standard
10.
Each time you redistribute the Library (or any work based on the Library),
the recipient automatically receives a license from the original licensor
to copy, distribute, link with or modify the Library subject to these terms
and conditions.
You may not impose any further restrictions on the recipients' exercise
of the rights granted herein.
You are not responsible for enforcing compliance by third parties to this
License.
\layout Standard
11.
If, as a consequence of a court judgment or allegation of patent infringement
or for any other reason (not limited to patent issues), conditions are
imposed on you (whether by court order, agreement or otherwise) that contradict
the conditions of this License, they do not excuse you from the conditions
of this License.
If you cannot distribute so as to satisfy simultaneously your obligations
under this License and any other pertinent obligations, then as a consequence
you may not distribute the Library at all.
For example, if a patent license would not permit royalty-free redistribution
of the Library by all those who receive copies directly or indirectly through
you, then the only way you could satisfy both it and this License would
be to refrain entirely from distribution of the Library.
\layout Standard
If any portion of this section is held invalid or unenforceable under any
particular circumstance, the balance of the section is intended to apply,
and the section as a whole is intended to apply in other circumstances.
\layout Standard
It is not the purpose of this section to induce you to infringe any patents
or other property right claims or to contest validity of any such claims;
this section has the sole purpose of protecting the integrity of the free
software distribution system which is implemented by public license practices.
Many people have made generous contributions to the wide range of software
distributed through that system in reliance on consistent application of
that system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot impose
that choice.
\layout Standard
This section is intended to make thoroughly clear what is believed to be
a consequence of the rest of this License.
\layout Standard
12.
If the distribution and/or use of the Library is restricted in certain
countries either by patents or by copyrighted interfaces, the original
copyright holder who places the Library under this License may add an explicit
geographical distribution limitation excluding those countries, so that
distribution is permitted only in or among countries not thus excluded.
In such case, this License incorporates the limitation as if written in
the body of this License.
\layout Standard
13.
The Free Software Foundation may publish revised and/or new versions of
the Library General Public License from time to time.
Such new versions will be similar in spirit to the present version, but
may differ in detail to address new problems or concerns.
\layout Standard
Each version is given a distinguishing version number.
If the Library specifies a version number of this License which applies
to it and "any later version", you have the option of following the terms
and conditions either of that version or of any later version published
by the Free Software Foundation.
If the Library does not specify a license version number, you may choose
any version ever published by the Free Software Foundation.
\layout Standard
14.
If you wish to incorporate parts of the Library into other free programs
whose distribution conditions are incompatible with these, write to the
author to ask for permission.
For software which is copyrighted by the Free Software Foundation, write
to the Free Software Foundation; we sometimes make exceptions for this.
Our decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing and
reuse of software generally.
\layout Standard
\align center
NO WARRANTY
\layout Standard
15.
BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR
THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER
PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH
YOU.
SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
SERVICING, REPAIR OR CORRECTION.
\layout Standard
16.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE
THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING
ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF
THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS
OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR
THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY
OF SUCH DAMAGES.
\layout Standard
\align center
END OF TERMS AND CONDITIONS
\layout Standard
\align center
How to Apply These Terms to Your New Libraries
\layout Standard
If you develop a new library, and you want it to be of the greatest possible
use to the public, we recommend making it free software that everyone can
redistribute and change.
You can do so by permitting redistribution under these terms (or, alternatively
, under the terms of the ordinary General Public License).
\layout Standard
To apply these terms, attach the following notices to the library.
It is safest to attach them to the start of each source file to most effectivel
y convey the exclusion of warranty; and each file should have at least the
"copyright" line and a pointer to where the full notice is found.
\layout Quote
<one line to give the library's name and a brief idea of what it does.>
\newline
Copyright (C) <year> <name of author>
\layout Quote
This library is free software; you can redistribute it and/or modify it
under the terms of the GNU Library General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.
\layout Quote
This library is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
See the GNU Library General Public License for more details.
\layout Quote
You should have received a copy of the GNU Library General Public License
along with this library; if not, write to the Free Software Foundation, Inc., 59
Temple Place, Suite 330, Boston, MA 02111-1307 USA
\layout Standard
Also add information on how to contact you by electronic and paper mail.
\layout Standard
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the library, if necessary.
Here is a sample; alter the names:
\layout Quote
Yoyodyne, Inc., hereby disclaims all copyright interest in the library `Frob'
(a library for tweaking knobs) written by James Random Hacker.
\layout Quote
<signature of Ty Coon>, 1 April 1990 Ty Coon, President of Vice
\layout Standard
That's all there is to it!
\layout Section
Ispell Copyright
\layout Standard
Copyright (c), 1983, by Pace Willisson
\layout Standard
Copyright 1992, 1993, Geoff Kuenning, Granada Hills, CA All rights reserved.
\layout Standard
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
\layout Standard
1.
Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
\layout Standard
2.
Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
\layout Standard
3.
All modifications to the source code must be clearly marked as such.
Binary redistributions based on modified source code must be clearly marked
as modified versions in the documentation and/or other materials provided
with the distribution.
\layout Standard
4.
All advertising materials mentioning features or use of this software must
display the following acknowledgment:
\layout Quote
This product includes software developed by Geoff Kuenning and other unpaid
contributors.
\layout Standard
5.
The name of Geoff Kuenning may not be used to endorse or promote products
derived from this software without specific prior written permission.
\layout Standard
THIS SOFTWARE IS PROVIDED BY GEOFF KUENNING AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED.
IN NO EVENT SHALL GEOFF KUENNING OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
\the_end