GNU Info

Info Node: (textutils.info)Squeezing

(textutils.info)Squeezing


Next: Warnings in tr Prev: Translating Up: tr invocation
Enter node , (file) or (file)node

Squeezing repeats and deleting
------------------------------

   When given just the `--delete' (`-d') option, `tr' removes any input
characters that are in SET1.

   When given just the `--squeeze-repeats' (`-s') option, `tr' replaces
each input sequence of a repeated character that is in SET1 with a
single occurrence of that character.

   When given both `--delete' and `--squeeze-repeats', `tr' first
performs any deletions using SET1, then squeezes repeats from any
remaining characters using SET2.

   The `--squeeze-repeats' option may also be used when translating, in
which case `tr' first performs translation, then squeezes repeats from
any remaining characters using SET2.

   Here are some examples to illustrate various combinations of options:

   * Remove all zero bytes:

          tr -d '\000'

   * Put all words on lines by themselves.  This converts all
     non-alphanumeric characters to newlines, then squeezes each string
     of repeated newlines into a single newline:

          tr -cs 'a-zA-Z0-9' '[\n*]'

   * Convert each sequence of repeated newlines to a single newline:

          tr -s '\n'

   * Find doubled occurrences of words in a document.  For example,
     people often write "the the" with the duplicated words separated
     by a newline.  The bourne shell script below works first by
     converting each sequence of punctuation and blank characters to a
     single newline.  That puts each "word" on a line by itself.  Next
     it maps all uppercase characters to lower case, and finally it
     runs `uniq' with the `-d' option to print out only the words that
     were adjacent duplicates.

          #!/bin/sh
          cat "$@" \
            | tr -s '[:punct:][:blank:]' '\n' \
            | tr '[:upper:]' '[:lower:]' \
            | uniq -d



automatically generated by info2www version 1.2.2.9