Copyright (C) 2000-2012
What to Count?
==============

When we first start thinking about how to count the words in a
function definition, the first question is (or ought to be) what are
we going to count?  When we speak of `words' with respect to a Lisp
function definition, we are actually speaking, in large part, of
`symbols'.  For example, the following `multiply-by-seven' function
contains the five symbols `defun', `multiply-by-seven', `number', `*',
and `7'.  In addition, in the documentation string, it contains the
four words `Multiply', `NUMBER', `by', and `seven'.  The symbol
`number' is repeated, so the definition contains a total of ten words
and symbols.

     (defun multiply-by-seven (number)
       "Multiply NUMBER by seven."
       (* 7 number))

However, if we mark the `multiply-by-seven' definition with `C-M-h'
(`mark-defun'), and then call `count-words-region' on it, we will find
that `count-words-region' claims the definition has eleven words, not
ten!  Something is wrong!

The problem is twofold: `count-words-region' does not count the `*' as
a word, and it counts the single symbol, `multiply-by-seven', as
containing three words.  The hyphens are treated as if they were
interword spaces rather than intraword connectors:
`multiply-by-seven' is counted as if it were written `multiply by
seven'.

The cause of this confusion is the regular expression search within
the `count-words-region' definition that moves point forward word by
word.  In the canonical version of `count-words-region', the regexp
is:

     "\\w+\\W*"

This regular expression is a pattern defining one or more word
constituent characters possibly followed by one or more characters
that are not word constituents.  What is meant by `word constituent
characters' brings us to the issue of syntax, which is worth a section
of its own.
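
The miscount described above can be reproduced outside of
`count-words-region' with a small sketch.  The helper name
`my-count-matches-in-string' is hypothetical and is not part of the
original text; it simply repeats the same regexp search over a string.
It assumes the text is being scanned with the Emacs Lisp mode syntax
table, as it would be in a buffer visiting a Lisp source file.

```elisp
;; Hypothetical helper for illustration only; it is not the
;; `count-words-region' definition itself.
(defun my-count-matches-in-string (regexp string)
  "Return the number of successive matches of REGEXP in STRING.
REGEXP should never match the empty string, or the loop will not end."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    ;; Assumption: use the syntax classes of Emacs Lisp mode, where
    ;; `-' and `*' are symbol constituents, not word constituents.
    (with-syntax-table emacs-lisp-mode-syntax-table
      (let ((count 0))
        ;; Each successful search moves point past one `word'.
        (while (re-search-forward regexp nil t)
          (setq count (1+ count)))
        count))))

;; The single symbol is split at its hyphens.
(my-count-matches-in-string "\\w+\\W*" "multiply-by-seven")
     ;; => 3

;; The `*' contains no word constituent, so it is never counted.
(my-count-matches-in-string "\\w+\\W*" "*")
     ;; => 0

;; The whole definition: eleven `words', not ten.
(my-count-matches-in-string
 "\\w+\\W*"
 "(defun multiply-by-seven (number)
  \"Multiply NUMBER by seven.\"
  (* 7 number))")
     ;; => 11
```

The two extra `words' come from the hyphens in `multiply-by-seven',
and the one missing `word' is the `*', which accounts for the
difference between ten and eleven.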