4. Locale Setup

4.1. Files and the kernel

You can now use any Unicode characters in file names. Neither the kernel nor the file utilities need modification. This is because the kernel accepts any file name that does not contain a null byte, and uses '/' to delimit subdirectories. In UTF-8, non-ASCII characters are never encoded using null bytes or slashes. The only effect is that file and directory names occupy more bytes than they contain characters. For example, a file name consisting of five Greek characters appears to the kernel as a 10-byte file name. The kernel does not know (and does not need to know) that these bytes are displayed as Greek.

This is the general theory, as long as your files reside on Linux. On filesystems that are also used from other operating systems, mount options control the conversion of file names to or from UTF-8:
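As one illustration of such an option, offered as an assumption rather than a complete list: the Linux vfat driver accepts a utf8 mount option. The sketch below uses a hypothetical device /dev/sdb1 and a hypothetical mount point /mnt/usb. It only builds and inspects an /etc/fstab-style line; the one-off mount command appears as a comment because mounting requires root.

```shell
# Sketch only: /dev/sdb1 and /mnt/usb are hypothetical names.
# A one-off mount with the vfat driver's utf8 option would look like:
#   mount -t vfat -o utf8 /dev/sdb1 /mnt/usb

# An /etc/fstab-style entry carries mount options in its fourth column.
fstab_entry='/dev/sdb1  /mnt/usb  vfat  utf8  0  0'

# Extract the fourth whitespace-separated field to see where the option sits.
fourth_column=$(printf '%s\n' "$fstab_entry" | awk '{ print $4 }')
printf '%s\n' "$fourth_column"
```

Other filesystem drivers may spell the option differently, so check the mount documentation for the filesystem in question before copying this line.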
The other filesystems (nfs, smbfs, ncpfs, hpfs, etc.) do not convert file names; they therefore support Unicode file names in UTF-8 encoding only if the other operating system supports them. To make a mount option apply to all future remounts, add it to the fourth column of the corresponding /etc/fstab line.

4.2. Locale environment variables

You should have the following environment variables set, containing locale names:
To tell your system and all applications that you are using UTF-8, add a codeset suffix of UTF-8 to your locale names. For example, to run an application in a UTF-8 Hindi locale from the bash shell, you can specify which environment variable is passed to the application.
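A minimal sketch of this in a POSIX-style shell such as bash, assuming the Hindi locale is named hi_IN.UTF-8 and is installed (locale -a lists the names available on your system). Prefixing a command with an assignment sets the variable for that command only. Here a child sh stands in for the application, and my_application in the comment is a placeholder.

```shell
# Assumption: hi_IN.UTF-8 is the installed name of the UTF-8 Hindi locale.
# General form (my_application is a placeholder):
#   LANG=hi_IN.UTF-8 my_application

# Demonstration: a child shell reports the LANG value it received.
child_lang=$(LANG=hi_IN.UTF-8 sh -c 'printf %s "$LANG"')
printf '%s\n' "$child_lang"
```

The assignment applies only to the command it prefixes, so the calling shell keeps its own LANG. To change the locale for the whole session, export the variable instead.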