Copyright (C) 2000-2012
GNU Info (textutils.info)Squeezing

Squeezing repeats and deleting
------------------------------

   When given just the `--delete' (`-d') option, `tr' removes any input
characters that are in SET1.

   When given just the `--squeeze-repeats' (`-s') option, `tr' replaces
each input sequence of a repeated character that is in SET1 with a
single occurrence of that character.

   When given both `--delete' and `--squeeze-repeats', `tr' first
performs any deletions using SET1, then squeezes repeats from any
remaining characters using SET2.

   The `--squeeze-repeats' option may also be used when translating, in
which case `tr' first performs translation, then squeezes repeats from
any remaining characters using SET2.

   Here are some examples to illustrate various combinations of options:

   * Remove all zero bytes:

          tr -d '\000'

   * Put all words on lines by themselves.  This converts all
     non-alphanumeric characters to newlines, then squeezes each string
     of repeated newlines into a single newline:

          tr -cs 'a-zA-Z0-9' '[\n*]'

   * Convert each sequence of repeated newlines to a single newline:

          tr -s '\n'

   * Find doubled occurrences of words in a document.  For example,
     people often write "the the" with the duplicated words separated
     by a newline.  The Bourne shell script below works first by
     converting each sequence of punctuation and blank characters to a
     single newline.  That puts each "word" on a line by itself.  Next
     it maps all uppercase characters to lower case, and finally it
     runs `uniq' with the `-d' option to print out only the words that
     were adjacent duplicates.

          #!/bin/sh
          cat "$@" \
            | tr -s '[:punct:][:blank:]' '\n' \
            | tr '[:upper:]' '[:lower:]' \
            | uniq -d
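
   The four option combinations described above can be compared side by
side on one input.  The following is only a minimal sketch: the sample
string `aabbccdd' and the variable names are made up for illustration,
and it assumes a POSIX shell with `tr' available.

```shell
#!/bin/sh
# Sample input with runs of repeated characters (made up for illustration).
input='aabbccdd'

# -d alone: delete every character that is in SET1.
deleted=$(printf '%s\n' "$input" | tr -d 'b')

# -s alone: squeeze each run of a character that is in SET1.
squeezed=$(printf '%s\n' "$input" | tr -s 'abc')

# -d and -s together: delete using SET1 ('b'), then squeeze using SET2 ('d').
both=$(printf '%s\n' "$input" | tr -ds 'b' 'd')

# -s while translating: map a and b to x, then squeeze using SET2 ('xx').
translated=$(printf '%s\n' "$input" | tr -s 'ab' 'xx')

printf '%s\n' "$deleted" "$squeezed" "$both" "$translated"
```

   Run as written, this should print `aaccdd', `abcdd', `aaccd' and
`xccdd', one per line.  Note that in the third case the doubled `a' and
`c' survive, because once `-d' is present the squeezing is controlled
by SET2 only.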