Info Node: (emacs-lisp-intro.info)the-the


The `the-the' Function
**********************

Sometimes when you you write text, you duplicate words--as with "you
you" near the beginning of this sentence.  I find that most
frequently, I duplicate "the"; hence, I call the function for
detecting duplicated words, `the-the'.

As a first step, you could use the following regular expression to
search for duplicates:

     \\(\\w+[ \t\n]+\\)\\1

This regexp matches one or more word-constituent characters followed
by one or more spaces, tabs, or newlines.  However, it does not detect
duplicated words on different lines, since the ending of the first
word, the end of the line, is different from the ending of the second
word, a space.  (For more information about regular expressions, see
Note: Regular Expression Searches, as well as Note: Syntax of Regular
Expressions, and Note: Regular Expressions.)

You might try searching just for duplicated word-constituent
characters but that does not work since the pattern detects doubles
such as the two occurrences of `th' in `with the'.

Another possible regexp searches for word-constituent characters
followed by non-word-constituent characters, reduplicated.  Here,
`\\w+' matches one or more word-constituent characters and `\\W*'
matches zero or more non-word-constituent characters.

     \\(\\(\\w+\\)\\W*\\)\\1

Again, not useful.

Here is the pattern that I use.  It is not perfect, but good enough.
`\\b' matches the empty string, provided it is at the beginning or end
of a word; `[^@ \n\t]+' matches one or more occurrences of any
characters that are _not_ an @-sign, space, newline, or tab.

     \\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b

One can write more complicated expressions, but I found that this
expression is good enough, so I use it.

Here is the `the-the' function, as I include it in my `.emacs' file,
along with a handy global key binding:

     (defun the-the ()
       "Search forward for for a duplicated word."
       (interactive)
       (message "Searching for for duplicated words ...")
       (push-mark)
       ;; This regexp is not perfect
       ;; but is fairly good over all:
       (if (re-search-forward
            "\\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b" nil 'move)
           (message "Found duplicated word.")
         (message "End of buffer")))

     ;; Bind `the-the' to  C-c \
     (global-set-key "\C-c\\" 'the-the)

Here is test text:

     one two two three four five five six seven

You can substitute the other regular expressions shown above in the
function definition and try each of them on this list.
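
If you would rather compare the regexps without editing `the-the'
each time, a small helper can collect every match from a string.  The
following is only a sketch, not part of the original `.emacs' code:
the names `my-dup-matches', `my-dup-regexp-same-line',
`my-dup-regexp-the-the', and `my-dup-sample-text' are hypothetical,
and the line break placed between the two occurrences of `five' is an
assumption made so that one duplicate spans two lines.

```elisp
;; Hypothetical names throughout; none of these are defined by Emacs
;; or by the text above.

(defconst my-dup-regexp-same-line "\\(\\w+[ \t\n]+\\)\\1"
  "First regexp from the text: a word and its trailing whitespace, repeated.")

(defconst my-dup-regexp-the-the "\\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b"
  "The regexp used inside `the-the'.")

(defconst my-dup-sample-text "one two two three four five\nfive six seven"
  "Test words; the line break between the two fives is an assumption.")

(defun my-dup-matches (regexp text)
  "Return a list of the strings in TEXT matched by REGEXP, in order."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((found '()))
      ;; A non-nil third argument makes a failed search return nil
      ;; instead of signaling an error, which ends the loop.
      (while (re-search-forward regexp nil t)
        (push (match-string 0) found))
      (nreverse found))))

(my-dup-matches my-dup-regexp-same-line my-dup-sample-text)
;; => ("two two ")

(my-dup-matches my-dup-regexp-the-the my-dup-sample-text)
;; => ("two two" "five\nfive")
```

The first regexp includes the trailing whitespace in the group it
repeats, so it reports only the duplicate that sits on one line; the
`the-the' regexp keeps the whitespace outside the group and so also
reports the pair split across the line break.  Any of the other
regexps can be passed as the first argument in the same way.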
