/proc is a pseudo-filesystem which is used as an interface to kernel
data structures, as an alternative to reading and interpreting /dev/kmem.
Most of it is read-only, but some files allow kernel variables to be changed.
The following outline gives a quick tour through the /proc hierarchy.
[number]
There is a numerical subdirectory for each running process; the
subdirectory is named by the process ID. Each contains the following
pseudo-files and directories.
cmdline
This holds the complete command line for the process, unless the whole
process has been swapped out, or unless the process is a zombie. In
either of these latter cases, there is nothing in this file: i.e. a
read on this file will return 0 characters. This file
is null-terminated, but not newline-terminated.
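As a rough sketch of how a program might decode this file, the Python function below splits the raw bytes into arguments. It assumes the arguments are separated by NUL bytes (see CAVEATS); the name parse_cmdline is hypothetical and is not part of any library.

```python
def parse_cmdline(raw):
    """Split the raw bytes of a cmdline pseudo-file into a list of arguments."""
    if not raw:
        # Swapped-out or zombie process: the read returned 0 characters.
        return []
    # Drop the terminating NUL so that split() yields no empty last item.
    if raw.endswith(b"\x00"):
        raw = raw[:-1]
    # Assumption: individual arguments are separated by NUL bytes.
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\x00")]
```

On a system with /proc mounted, the function could be fed the result of open("/proc/self/cmdline", "rb").read().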
cwd
This is a link to the current working directory of the process. To find out
the cwd of process 20, for instance, you can do this:
cd /proc/20/cwd; /bin/pwd
Note that the pwd command is often a shell builtin, and might
not work properly in this context.
environ
This file contains the environment for the process.
The entries are separated by null characters,
and there may be a null character at the end.
Thus, to print out the environment of process 1, you would do:
(cat /proc/1/environ; echo) | tr "\000" "\n"
(For a reason why one should want to do this, see
lilo(8).)
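The same decoding can be sketched in Python. The function below turns the raw contents of environ into a dictionary; parse_environ is a hypothetical name, and the sketch assumes each entry has the usual NAME=value form.

```python
def parse_environ(raw):
    """Turn the raw bytes of an environ pseudo-file into a dict."""
    env = {}
    for entry in raw.split(b"\x00"):
        if not entry:
            # Empty piece left over after a trailing null character.
            continue
        # Split on the first '=' only; the value may itself contain '='.
        name, _, value = entry.partition(b"=")
        env[name.decode("utf-8", errors="replace")] = value.decode(
            "utf-8", errors="replace"
        )
    return env
```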
exe
A pointer to the binary which was executed; it appears as a symbolic
link. Under Linux 2.0 and earlier, a readlink(2) call on the exe
special file returns a string in the format:
[device]:inode
For example, [0301]:1502 would be inode 1502 on device major 03 (IDE,
MFM, etc. drives) minor 01 (first partition on the first drive).
Under Linux 2.2 the link contains the actual path name of the
command.
Also, the symbolic link can be dereferenced normally: attempting to
open "exe" will open the executable. You can even type
/proc/[number]/exe
to run another copy of the same program that process [number] is running.
Under Linux 2.0 and earlier, find(1) with the -inum option can be used to
locate the file from the inode number.
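The Linux 2.0 "[device]:inode" string can be taken apart mechanically. The following Python sketch decodes it, assuming the device part is four hexadecimal digits (two for the major number, two for the minor number), as the [0301]:1502 example suggests. The function name parse_exe_link is hypothetical.

```python
import re

# Assumption: two hex digits of major number, two of minor, then a decimal inode.
_EXE_LINK = re.compile(r"\[([0-9a-fA-F]{2})([0-9a-fA-F]{2})\]:(\d+)")


def parse_exe_link(link):
    """Decode a '[device]:inode' string into (major, minor, inode)."""
    m = _EXE_LINK.fullmatch(link)
    if m is None:
        raise ValueError("not in [device]:inode format: %r" % link)
    major = int(m.group(1), 16)
    minor = int(m.group(2), 16)
    inode = int(m.group(3))
    return major, minor, inode
```

The inode number returned here is the value one would pass to find(1) with -inum. A Linux 2.2 style link (a plain path name) makes the function raise ValueError.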
fd
This is a subdirectory containing one entry for each file which the
process has open, named by its file descriptor, and which is a
symbolic link to the actual file (as the exe entry is). Thus, 0 is
standard input, 1 standard output, 2 standard error, etc.
Programs that will take a filename, but will not take the standard
input, and which write to a file, but will not send their output to
standard output, can be effectively foiled this way, assuming that -i
is the flag designating an input file and -o is the flag designating
an output file:
foobar -i /proc/self/fd/0 -o /proc/self/fd/1 ...
and you have a working filter. Note that this will not work for
programs that seek on their files, as the files in the fd directory
are not seekable.
/proc/self/fd/N is approximately the same as /dev/fd/N in some UNIX
and UNIX-like systems. Most Linux MAKEDEV scripts symbolically link
/dev/fd to [..]/proc/self/fd, in fact.
maps
A file containing the currently mapped memory regions and their access
permissions. Each line describes one region with the fields
address perms offset dev inode
where address is the address space in the process that it occupies,
perms is a set of permissions:
r = read
w = write
x = execute
s = shared
p = private (copy on write)
offset is the offset into the file/whatever, dev is the device
(major:minor), and inode is the inode on that device. 0 indicates
that no inode is associated with the memory region, as the case would
be with bss.
Under Linux 2.2 there is an additional field giving a pathname
where applicable.
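A line of maps can be split into the fields described above. The Python sketch below does so under several assumptions that are not stated in this page: addresses, offset and device numbers are taken to be hexadecimal, the inode decimal, and the address field is taken to be a start-end pair. The function name and the sample lines in the usage note are purely illustrative.

```python
def parse_maps_line(line):
    """Split one line of a maps pseudo-file into its named fields."""
    # At most 6 pieces: the optional pathname (Linux 2.2) is the last one.
    fields = line.split(None, 5)
    if len(fields) < 5:
        raise ValueError("too few fields: %r" % line)
    start, _, end = fields[0].partition("-")
    major, _, minor = fields[3].partition(":")
    return {
        "start": int(start, 16),            # assumed hexadecimal
        "end": int(end, 16),                # assumed hexadecimal
        "perms": fields[1],                 # some of r, w, x, s, p
        "offset": int(fields[2], 16),       # assumed hexadecimal
        "dev": (int(major, 16), int(minor, 16)),
        "inode": int(fields[4]),            # 0 means no inode (e.g. bss)
        "pathname": fields[5].strip() if len(fields) > 5 else None,
    }
```

For a made-up line such as "08048000-0804c000 r-xp 00000000 03:01 1502 /bin/cat", the result has dev (3, 1) and inode 1502; a line without a trailing path yields pathname None.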
mem
This is not the same as the /dev/mem (1:1) device, despite the fact that it
has the same device numbers. The /dev/mem device is the physical
memory before any address translation is done, but the mem file here
is the memory of the process that accesses it. This cannot be
mmap(2)'ed currently, and will not be until a general mmap(2)
is added to the kernel. (This might have happened by the time you read this.)
mmap
Directory of maps by mmap(2) which are symbolic links like exe, fd/*, etc.
Note that maps includes a superset of this information, so /proc/*/mmap
should be considered obsolete.
"0" is usually libc.so.4.
/proc/*/mmap was removed in Linux kernel version 1.1.40. (It really was
obsolete!)
root
Unix and Linux support the idea of a per-process root of the
filesystem, set by the chroot(2) system call. The root entry points to
the file system root, and behaves as exe, fd/*, etc. do.
stat
Status information about the process. This is used by ps(1).
The fields, in order, with their proper scanf(3) format specifiers, are:
pid %d
The process id.
comm %s
The filename of the executable, in parentheses. This is visible
whether or not the executable is swapped out.
state %c
One character from the string "RSDZT" where R is running, S is
sleeping in an interruptible wait, D is sleeping in an uninterruptible
wait or swapping, Z is zombie, and T is traced or stopped (on a
signal).
ppid %d
The PID of the parent.
pgrp %d
The process group ID of the process.
session %d
The session ID of the process.
tty %d
The tty the process uses.
tpgid %d
The process group ID of the process which currently owns the tty that
the process is connected to.
flags %u
The flags of the process. Currently, every flag has the math bit set,
because crt0.s checks for math emulation, so this is not included in
the output. This is probably a bug, as not every process is a
compiled C program. The math bit should be a decimal 4, and the
traced bit is decimal 10.
minflt %u
The number of minor faults the process has made, those which have not
required loading a memory page from disk.
cminflt %u
The number of minor faults that the process and its children have
made.
majflt %u
The number of major faults the process has made, those which have
required loading a memory page from disk.
cmajflt %u
The number of major faults that the process and its children have
made.
utime %d
The number of jiffies that this process has been scheduled in user
mode.
stime %d
The number of jiffies that this process has been scheduled in kernel
mode.
cutime %d
The number of jiffies that this process and its children have been
scheduled in user mode.
cstime %d
The number of jiffies that this process and its children have been
scheduled in kernel mode.
counter %d
The current maximum size in jiffies of the process's next timeslice,
or what is currently left of its current timeslice, if it is the
currently running process.
priority %d
The standard nice value, plus fifteen. The value is never negative in
the kernel.
timeout %u
The time in jiffies of the process's next timeout.
itrealvalue %u
The time (in jiffies) before the next SIGALRM is sent to the process
due to an interval timer.
starttime %d
Time the process started in jiffies after system boot.
vsize %u
Virtual memory size
rss %u
Resident Set Size: number of pages the process has in real memory,
minus 3 for administrative purposes. This is just the pages which
count towards text, data, or stack space. This does not include pages
which have not been demand-loaded in, or which are swapped out.
rlim %u
Current limit in bytes on the rss of the process (usually
2,147,483,647).
startcode %u
The address above which program text can run.
endcode %u
The address below which program text can run.
startstack %u
The address of the start of the stack.
kstkesp %u
The current value of esp (32-bit stack pointer), as found in the
kernel stack page for the process.
kstkeip %u
The current EIP (32-bit instruction pointer).
signal %d
The bitmap of pending signals (usually 0).
blocked %d
The bitmap of blocked signals (usually 0, 2 for shells).
sigignore %d
The bitmap of ignored signals.
sigcatch %d
The bitmap of caught signals.
wchan %u
This is the "channel" in which the process is waiting. This is the
address of a system call, and can be looked up in a namelist if you
need a textual name. (If you have an up-to-date /etc/psdatabase, then
try ps -l to see the WCHAN field in action.)
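Because comm is enclosed in parentheses and may contain spaces, a plain whitespace split of the stat line is fragile. The Python sketch below isolates comm first and then reads the leading fields in the order listed above. It covers only the fields up to cstime; the function name, the field tuple, and the sample line in the usage note are illustrative assumptions.

```python
# Names of the fields that follow comm, in the order listed above (up to cstime).
STAT_FIELDS = (
    "state", "ppid", "pgrp", "session", "tty", "tpgid", "flags",
    "minflt", "cminflt", "majflt", "cmajflt",
    "utime", "stime", "cutime", "cstime",
)


def parse_stat(line):
    """Parse the leading fields of a per-process stat line into a dict."""
    # comm may contain spaces or ')', so cut at the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    result = {"pid": int(line[:lparen]), "comm": line[lparen + 1:rparen]}
    rest = line[rparen + 1:].split()
    result["state"] = rest[0]  # a single character, e.g. R, S, D, Z or T
    for name, value in zip(STAT_FIELDS[1:], rest[1:]):
        result[name] = int(value)
    return result
```

Given a made-up line such as "20 (my prog) S 1 20 20 1025 20 0 12 34 5 6 70 80 90 100", the result has comm "my prog" and utime 70 jiffies.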
cpuinfo
This is a collection of CPU and system architecture dependent items;
the list differs for each supported architecture.
The only two common entries are cpu, which is (guess what) the CPU
currently in use, and BogoMIPS, a system constant which is calculated
during kernel initialization.
devices
Text listing of major numbers and device groups. This can be used by
MAKEDEV scripts for consistency with the kernel.
dma
This is a list of the registered ISA DMA (direct memory access)
channels in use.
filesystems
A text listing of the filesystems which were compiled into the kernel.
Incidentally, this is used by
mount(1)
to cycle through different filesystems when none is specified.
interrupts
This is used to record the number of interrupts per IRQ on (at
least) the i386 architecture. The formatting is very easy to read and
is done in ASCII.
ioports
This is a list of currently registered Input-Output port regions that
are in use.
kcore
This file represents the physical memory of the system and is stored
in the core file format. With this pseudo-file, and an unstripped
kernel (/usr/src/linux/tools/zSystem) binary, GDB can be used to
examine the current state of any kernel data structures.
The total length of the file is the size of physical memory (RAM) plus
4KB.
kmsg
This file can be used instead of the syslog(2) system call to log
kernel messages. A process must have superuser privileges to read this
file, and only one process should read this file. This file should not
be read if a syslog process is running which uses the syslog(2) system
call facility to log kernel messages.
Information in this file is retrieved with the dmesg(8) program.
ksyms
This holds the kernel exported symbol definitions used by the modules(X)
tools to dynamically link and bind loadable modules.
loadavg
The load average numbers give the number of jobs in the run queue (state R)
or waiting for disk I/O (state D) averaged over 1, 5 and 15 minutes.
They are the same as the load average numbers given by uptime(1)
and other programs.
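Reading the three averages from a program is straightforward. The Python sketch below assumes the file begins with three whitespace-separated decimal numbers and ignores anything after them; parse_loadavg is a hypothetical name.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages as floats."""
    # Only the first three numbers are used; later fields, if any, are ignored.
    one, five, fifteen = (float(v) for v in text.split()[:3])
    return one, five, fifteen
```

On a system with /proc mounted, parse_loadavg(open("/proc/loadavg").read()) would give the same figures that uptime(1) prints.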
locks
This file shows current file locks.
malloc
This file is only present if CONFIG_DEBUG_MALLOC was defined during
compilation.
meminfo
This is used by free(1) to report the amount of free and used memory
(both physical and swap) on the system as well as the shared memory and
buffers used by the kernel.
It is in the same format as free(1), except in bytes rather than KB.
modules
A text list of the modules that have been loaded by the system.
net
Various net pseudo-files, all of which give the status of some part of
the networking layer. These files contain ASCII structures, and are
therefore readable with cat. However, the standard netstat(8)
suite provides much cleaner access to these files.
arp
This holds an ASCII readable dump of the kernel ARP table used for
address resolutions. It will show both dynamically learned and
pre-programmed ARP entries. The format is:
IP address       HW type   Flags     HW address
10.11.100.129    0x1       0x6       00:20:8A:00:0C:5A
10.11.100.5      0x1       0x2       00:C0:EA:00:00:4E
44.131.10.6      0x3       0x2       GW4PTS
Where 'IP address' is the IPv4 address of the machine, the 'HW type' is the
hardware type of the address from RFC 826. The flags are the internal flags
of the ARP structure (as defined in /usr/include/linux/if_arp.h) and the 'HW
address' is the physical layer mapping for that IP address if it is known.
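The ARP table shown above lends itself to simple whitespace splitting, provided the header line is skipped (its column names contain spaces). The Python sketch below parses the sample from this page; parse_arp and ARP_SAMPLE are hypothetical names, and the hexadecimal interpretation of the type and flags columns follows from their 0x prefix.

```python
# The sample table from this page, used here as test input.
ARP_SAMPLE = """\
IP address       HW type   Flags     HW address
10.11.100.129    0x1       0x6       00:20:8A:00:0C:5A
10.11.100.5      0x1       0x2       00:C0:EA:00:00:4E
44.131.10.6      0x3       0x2       GW4PTS
"""


def parse_arp(text):
    """Parse the ASCII ARP table into a list of dicts, one per entry."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        ip, hw_type, flags, hw_address = fields[:4]
        entries.append({
            "ip": ip,
            "hw_type": int(hw_type, 16),
            "flags": int(flags, 16),
            "hw_address": hw_address,
        })
    return entries
```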
dev
The dev pseudo-file contains network device status information. This gives
the number of received and sent packets, the number of errors and collisions
and other basic statistics. These are used by the ifconfig(8)
program to report device status. The format is:
rarp
This file uses the same format as the arp
file and contains the current reverse mapping database used to provide
rarp(8) reverse address lookup services. If RARP is not configured into
the kernel this file will not be present.
raw
Holds a dump of the RAW socket table. Much of the information is not of use
apart from debugging. The "sl" value is the kernel hash slot for the socket,
the "local address" is the local address and protocol number pair. "St" is
the internal status of the socket. The "tx_queue" and "rx_queue" are the
outgoing and incoming data queue in terms of kernel memory usage. The "tr",
"tm->when" and "rexmits" fields are not used by RAW. The uid field holds the
creator euid of the socket.
snmp
This file holds the ASCII data needed for the IP, ICMP, TCP and UDP management
information bases for an SNMP agent. As of this writing, the TCP MIB is
incomplete. It is hoped to have it completed by 1.2.0.
tcp
Holds a dump of the TCP socket table. Much of the information is not of use
apart from debugging. The "sl" value is the kernel hash slot for the socket,
the "local address" is the local address and port number pair. The "remote
address" is the remote address and port number pair (if connected). "St" is
the internal status of the socket. The "tx_queue" and "rx_queue" are the
outgoing and incoming data queue in terms of kernel memory usage. The "tr",
"tm->when" and "rexmits" fields hold internal information of the kernel
socket state and are only useful for debugging. The uid field holds the
creator euid of the socket.
udp
Holds a dump of the UDP socket table. Much of the information is not of use
apart from debugging. The "sl" value is the kernel hash slot for the socket,
the "local address" is the local address and port number pair. The "remote
address" is the remote address and port number pair (if connected). "St" is
the internal status of the socket. The "tx_queue" and "rx_queue" are the
outgoing and incoming data queue in terms of kernel memory usage. The "tr",
"tm->when" and "rexmits" fields are not used by UDP. The uid field holds the
creator euid of the socket. The format is:
unix
Lists the UNIX domain sockets present within the system and their
status. The format is:
Num RefCount Protocol Flags    Type St Path
 0: 00000002 00000000 00000000 0001 03
 1: 00000001 00000000 00010000 0001 01 /dev/printer
Where 'Num' is the kernel table slot number, 'RefCount' is the number
of users of the socket, 'Protocol' is currently always 0, 'Flags'
represent the internal kernel flags holding the status of the
socket. Type is always '1' currently (Unix domain datagram sockets are
not yet supported in the kernel). 'St' is the internal state of the
socket and Path is the bound path (if any) of the socket.
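The UNIX domain socket table can be read in much the same way as the other net files. The Python sketch below parses the sample above; it assumes the numeric columns other than Num are hexadecimal, which this page does not state, and the names parse_unix and UNIX_SAMPLE are hypothetical.

```python
# The sample table from this page, used here as test input.
UNIX_SAMPLE = """\
Num RefCount Protocol Flags    Type St Path
 0: 00000002 00000000 00000000 0001 03
 1: 00000001 00000000 00010000 0001 01 /dev/printer
"""


def parse_unix(text):
    """Parse the UNIX domain socket table into a list of dicts."""
    sockets = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 6:
            continue  # blank or malformed line
        sockets.append({
            "num": int(fields[0].rstrip(":")),
            # Assumption: the remaining numeric columns are hexadecimal.
            "refcount": int(fields[1], 16),
            "protocol": int(fields[2], 16),
            "flags": int(fields[3], 16),
            "type": int(fields[4], 16),
            "state": int(fields[5], 16),
            # Path is present only for bound sockets.
            "path": fields[6] if len(fields) > 6 else None,
        })
    return sockets
```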
pci
This is a listing of all PCI devices found during kernel initialization
and their configuration.
scsi
A directory with the scsi midlevel pseudo-file and various SCSI lowlevel driver
directories, which contain a file for each SCSI host in this system, all of
which give the status of some part of the SCSI IO subsystem.
These files contain ASCII structures, and are therefore readable with cat.
You can also write to some of the files to reconfigure the subsystem or switch
certain features on or off.
scsi
This is a listing of all SCSI devices known to the kernel. The listing is
similar to the one seen during bootup.
scsi currently supports only the add-single-device command which allows
root to add a hotplugged device to the list of known devices.
For example, the command
echo 'scsi add-single-device 1 0 5 0' > /proc/scsi/scsi
will cause host scsi1 to scan on SCSI channel 0 for a device on ID 5 LUN 0.
If there is already a device known on this address or the address is
invalid, an error will be returned.
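The four numbers in the add-single-device command are, in order, host, channel, ID and LUN. The small Python sketch below builds the command string from named arguments so the order cannot be mixed up; the function name is hypothetical, and the sketch only formats the string rather than writing it anywhere.

```python
def add_single_device_command(host, channel, scsi_id, lun):
    """Build the string to be written to /proc/scsi/scsi."""
    for name, value in (("host", host), ("channel", channel),
                        ("scsi_id", scsi_id), ("lun", lun)):
        if not isinstance(value, int) or value < 0:
            raise ValueError("%s must be a non-negative integer" % name)
    return "scsi add-single-device %d %d %d %d" % (host, channel, scsi_id, lun)
```

Writing the resulting string to /proc/scsi/scsi requires root, as with the echo command above.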
drivername
drivername can currently be: NCR53c7xx, aha152x, aha1542, aha1740,
aic7xxx, buslogic, eata_dma, eata_pio, fdomain, in2000, pas16, qlogic,
scsi_debug, seagate, t128, u14-34f, ultrastor or wd7000.
These directories show up for all drivers which registered at least one SCSI
HBA. Every directory contains one file per registered host. Every host-file is
named after the number the host got assigned during initialization.
Reading these files will usually show driver and host configuration,
statistics etc.
Writing to these files allows different things on different hosts. For example
with the latency and nolatency commands root can switch on and off
command latency measurement code in the eata_dma driver. With the lockup
and unlock commands root can control bus lockups simulated by the
scsi_debug driver.
self
This directory refers to the process accessing the /proc filesystem,
and is identical to the /proc directory named by the process ID of the
same process.
stat
Kernel/system statistics.
cpu 3357 0 4313 1362393
The number of jiffies (1/100ths of a second) that the system spent in
user mode, user mode with low priority (nice), system mode, and the
idle task, respectively. The last value should be 100 times the
second entry in the uptime pseudo-file.
disk 0 0 0 0
The four disk entries are not implemented at this time. I'm not even
sure what this should be, since kernel statistics on other machines
usually track both transfer rate and I/Os per second and this only
allows for one field per drive.
page 5741 1808
The number of pages the system paged in and the number that were paged
out (from disk).
swap 1 0
The number of swap pages that have been brought in and out.
intr 1462898
The number of interrupts received since the system boot.
ctxt 115315
The number of context switches that the system underwent.
btime 769041601
Boot time, in seconds since the epoch (January 1, 1970).
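The lines of this file share one shape: a keyword followed by one or more integers. The Python sketch below parses the sample values shown above and converts the cpu counters from jiffies to seconds, using the 1/100 second jiffy stated in the text. The names parse_kernel_stat, cpu_seconds and STAT_SAMPLE are hypothetical.

```python
HZ = 100  # jiffies per second, as stated in the text above

# The sample values from this page, used here as test input.
STAT_SAMPLE = """\
cpu 3357 0 4313 1362393
disk 0 0 0 0
page 5741 1808
swap 1 0
intr 1462898
ctxt 115315
btime 769041601
"""


def parse_kernel_stat(text):
    """Map each keyword to the list of integers that follow it."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:
            stats[fields[0]] = [int(v) for v in fields[1:]]
    return stats


def cpu_seconds(stats):
    """Convert the four cpu counters from jiffies to seconds."""
    names = ("user", "nice", "system", "idle")
    return {name: jiffies / HZ for name, jiffies in zip(names, stats["cpu"])}
```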
sys
This directory (present since 1.3.57) contains a number of files
and subdirectories corresponding to kernel variables.
These variables can be read and sometimes modified using
the proc file system, and using the sysctl(2)
system call. Presently, there are subdirectories
kernel, net, vm
that each contain more files and subdirectories.
kernel
This contains files
domainname, file-max, file-nr, hostname,
inode-max, inode-nr, osrelease, ostype,
panic, real-root-dev, securelevel, version,
with function fairly clear from the name.
The (read-only) file file-nr gives the number of files presently opened.
The file file-max gives the maximum number of open files the kernel is
willing to handle. If 1024 is not enough for you, try
echo 4096 > /proc/sys/kernel/file-max
Similarly, the files inode-nr and inode-max indicate the present and the
maximum number of inodes.
The files ostype, osrelease, version give substrings of /proc/version.
The file panic gives r/w access to the kernel variable panic_timeout.
If this is zero, the kernel will loop on a panic; if nonzero
it indicates that the kernel should autoreboot after this number
of seconds.
The file securelevel seems rather meaningless at present - root is just
too powerful.
uptime
This file contains two numbers: the uptime of the system (seconds),
and the amount of time spent in the idle process (seconds).
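The two numbers can be combined into a rough measure of how idle the system has been since boot. The Python sketch below is one way to do that; the function names and the sample values in the usage note are illustrative assumptions.

```python
def parse_uptime(text):
    """Return (uptime_seconds, idle_seconds) as floats."""
    up, idle = (float(v) for v in text.split()[:2])
    return up, idle


def idle_fraction(text):
    """Fraction of the time since boot that was spent in the idle process."""
    up, idle = parse_uptime(text)
    # Guard against division by zero right after boot.
    return idle / up if up > 0 else 0.0
```

For made-up contents such as "13657.21 13623.93", the idle fraction comes out just under 1, which is what one would expect of a mostly idle machine.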
version
This string identifies the kernel version that is currently running.
For instance:
Linux version 1.0.9 (quinlan@phaze) #1 Sat May 14 01:51:54 EDT 1994
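A string of this shape can be split into its parts with a regular expression. The Python sketch below handles the layout of the example above (release, builder in parentheses, build number, date); other kernels may add further items to the line, in which case the pattern would need adjusting. The names VERSION_RE and parse_version are hypothetical.

```python
import re

# Assumption: "Linux version <release> (<builder>) #<n> <date>", as in the example.
VERSION_RE = re.compile(
    r"^Linux version (?P<release>\S+) "
    r"\((?P<builder>[^)]*)\) "
    r"(?P<build>#\d+) "
    r"(?P<date>.+)$"
)


def parse_version(line):
    """Return a dict of the parts of the version string, or None if no match."""
    m = VERSION_RE.match(line.strip())
    return m.groupdict() if m else None
```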
CONFORMS TO
This roughly conforms to a Linux 1.3.11 kernel. Please update this as
necessary!
Last updated for Linux 1.3.11.
CAVEATS
Note that many strings (i.e., the environment and command line) are in
the internal format, with sub-fields terminated by NUL bytes, so you
may find that things are more readable if you use od -c or tr
"\000" "\n" to read them.
This manual page is incomplete, possibly inaccurate, and is the kind
of thing that needs to be updated very often.
BUGS
The /proc file system may introduce security holes into processes running
with chroot(2). For example, if /proc is mounted in the chroot hierarchy, a
chdir(2) to /proc/1/root will return to the original root of the file
system. This may be considered a feature instead of a bug, since Linux
does not yet support the fchroot(2) call.