Non-ASCII Characters in Strings
...............................
You can include a non-ASCII international character in a string
constant by writing it literally. There are two text representations
for non-ASCII characters in Emacs strings (and in buffers): unibyte and
multibyte. If the string constant is read from a multibyte source,
such as a multibyte buffer or string, or a file that would be visited as
multibyte, then the character is read as a multibyte character, and that
makes the string multibyte. If the string constant is read from a
unibyte source, then the character is read as unibyte and that makes the
string unibyte.
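
As a rough illustration, `multibyte-string-p' reports which
representation a string constant ended up with.  The helper name
`my-string-representation' is hypothetical, and the example assumes
the expression is read from a multibyte source, as described above.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-string-representation (string)
  "Return `multibyte' or `unibyte', according to STRING's representation."
  (if (multibyte-string-p string) 'multibyte 'unibyte))

;; Assumes this text is read from a multibyte buffer or file;
;; read from a unibyte source, the same constant would be unibyte.
(my-string-representation "à")
```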

You can also represent a multibyte non-ASCII character with its
character code: use a hex escape, `\xNNNNNNN', with as many digits as
necessary. (Multibyte non-ASCII character codes are all greater than
256.) Any character which is not a valid hex digit terminates this
construct. If the next character in the string could be interpreted as
a hex digit, write `\ ' (backslash and space) to terminate the hex
escape--for example, `\x8e0\ ' represents one character, `a' with grave
accent. `\ ' in a string constant is just like backslash-newline; it
does not contribute any character to the string, but it does terminate
the preceding hex escape.
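
The effect of the terminator can be sketched as follows.  The variable
names are hypothetical.  Which character a given code displays as
depends on the character encoding of the Emacs version in use, so the
sketch looks only at lengths and codes, and it assumes that #x8e0a is
a valid character code in the running Emacs.

```elisp
;; Hypothetical names, for illustration only.
;; `\ ' ends the hex escape, so the `a' is a separate character.
(defconst my-terminated "\x8e0\ a")
;; Without `\ ', the `a' is read as a fourth hex digit.
(defconst my-unterminated "\x8e0a")

(length my-terminated)                ; two characters
(= (aref my-terminated 0) #x8e0)      ; the first has code #x8e0
(length my-unterminated)              ; one character
(= (aref my-unterminated 0) #x8e0a)   ; its code is #x8e0a
```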

Using a multibyte hex escape forces the string to be multibyte.  You
can also represent a unibyte non-ASCII character with its character
code, which must be in the range from 128 (0200 octal) to 255 (0377
octal).  Doing so forces the string to be unibyte.
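
A minimal sketch contrasting the two cases, using `multibyte-string-p'
to inspect the result.  The variable names are hypothetical, and the
unibyte character is written here as an octal escape, which is one way
to give its code.

```elisp
;; Hypothetical names, for illustration only.
(defconst my-multibyte-string "\x8e0\ ")  ; hex escape, code above 256
(defconst my-unibyte-string "\340")       ; octal escape, code 224

(multibyte-string-p my-multibyte-string)  ; non-nil
(multibyte-string-p my-unibyte-string)    ; nil
(aref my-unibyte-string 0)                ; 224
```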

*Note Text Representations::, for more information about the two
text representations.