

What to Count?
==============

   When we first think about how to count the words in a function
definition, the first question is (or ought to be): what are we going
to count?  When we speak of `words' with respect to a Lisp
function definition, we are actually speaking, in large part, of
`symbols'.  For example, the following `multiply-by-seven' function
contains the five symbols `defun', `multiply-by-seven', `number', `*',
and `7' (strictly speaking, `7' is a number rather than a symbol, but
we count it as one here).  In addition, in the documentation string,
it contains the four words `Multiply', `NUMBER', `by', and `seven'.
The symbol `number' occurs twice, so the definition contains a total
of ten words and symbols: five symbols, four words, and one repeat.

     (defun multiply-by-seven (number)
       "Multiply NUMBER by seven."
       (* 7 number))

However, if we mark the `multiply-by-seven' definition with `C-M-h'
(`mark-defun'), and then call `count-words-region' on it, we will find
that `count-words-region' claims the definition has eleven words, not
ten!  Something is wrong!

   The problem is twofold: `count-words-region' does not count the `*'
as a word, and it counts the single symbol, `multiply-by-seven', as
containing three words.  The hyphens are treated as if they were
interword spaces rather than intraword connectors: `multiply-by-seven'
is counted as if it were written `multiply by seven'.
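
   The miscount can be reproduced with a minimal sketch that counts
runs of word-constituent characters in a string.  The helper name
`my-count-word-runs' is hypothetical (it does not come with Emacs),
and the sketch assumes the standard syntax table, in which `-' and
`*' are not word constituents.

```elisp
;; A sketch only: `my-count-word-runs' is a hypothetical helper,
;; not a function that comes with Emacs.
(defun my-count-word-runs (string)
  "Return the number of runs of word-constituent characters in STRING."
  (with-temp-buffer
    ;; Assumption: use the standard syntax table, in which `-' and
    ;; `*' are not word constituents.
    (set-syntax-table (standard-syntax-table))
    (insert string)
    (goto-char (point-min))
    (let ((count 0))
      ;; Each successful search moves point past one run.
      (while (re-search-forward "\\w+" nil t)
        (setq count (1+ count)))
      count)))

;; The single symbol on its own.
(my-count-word-runs "multiply-by-seven")

;; The whole definition.
(my-count-word-runs
 "(defun multiply-by-seven (number)
  \"Multiply NUMBER by seven.\"
  (* 7 number))")
```

Under these assumptions, the first call should return 3 and the second
11: the symbol is split at its hyphens, and the `*' is never counted.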

   The cause of this confusion is the regular expression search within
the `count-words-region' definition that moves point forward word by
word.  In the canonical version of `count-words-region', the regexp is:

     "\\w+\\W*"

This regular expression is a pattern defining one or more word
constituent characters followed by zero or more characters that are
not word constituents.  What is meant by `word constituent
characters' brings us to the issue of syntax, which is worth a section
of its own.
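
   To watch the regexp at work, the following sketch returns the text
of its first match in a string.  The helper name
`my-first-word-match' is hypothetical, and the sketch again assumes
the standard syntax table rather than the table of any particular
mode.

```elisp
;; A sketch only: `my-first-word-match' is a hypothetical helper.
(defun my-first-word-match (string)
  "Return the text of the first word-regexp match in STRING, or nil."
  (with-temp-buffer
    ;; `string-match' consults the current buffer's syntax table, so
    ;; select the standard one explicitly (an assumption of this sketch).
    (set-syntax-table (standard-syntax-table))
    (when (string-match "\\w+\\W*" string)
      (match-string 0 string))))

;; The match stops after the first hyphen.
(my-first-word-match "multiply-by-seven")

;; The search skips the parenthesis, the asterisk, and the space.
(my-first-word-match "(* 7 number)")
```

Under these assumptions, the first call should return `"multiply-"'
and the second `"7 "': the hyphen is consumed as a non-word character,
and the `*' is skipped because it cannot begin a match.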
