The most common datafile modifier is `using`.
Syntax:
plot 'file' using {<entry> {:<entry> {:<entry> ...}}} {'format'}
If a format is specified, each datafile record is read using the C library's
'scanf' function, with the specified format string. Otherwise the record is
read and broken into columns at spaces or tabs. A format cannot be specified
if time-format data is being used (this must be done by `set data time`).
The resulting array of data is then sorted into columns according to the
entries. Each <entry> may be a simple column number, which selects the
datum, an expression enclosed in parentheses, or empty. The expression can
use $1 to access the first item read, $2 for the second item, and so on. It
can also use `column(x)` and `valid(x)` where x is an arbitrary expression
resulting in an integer. `column(x)` returns the x'th datum; `valid(x)`
tests that the datum in the x'th column is a valid number. A column number
of 0 generates a number increasing (from zero) with each point, and is reset
upon encountering two blank records. A column number of -1 gives the
dataline number, which starts at 0, increments at single blank records, and
is reset at double blank records. A column number of -2 gives the index
number, which is incremented only when two blank records are found. An empty
<entry> will default to its order in the list of entries. For example,
`using ::4` is interpreted as `using 1:2:4`.
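The entry forms described above can be sketched as follows. This is an illustrative sketch only: `'file'` is the placeholder file name used throughout this section, and it is assumed to hold at least four numeric columns.

```gnuplot
# x = point number (pseudo-column 0), y = first data column
plot 'file' using 0:1

# empty entries default to their position in the list: same as using 1:2:4
plot 'file' using ::4 with yerrorbars

# column(x) with an integer expression; column(1+1) is the same datum as $2
plot 'file' using 1:(column(1+1))

# valid(x) guards against a non-numeric second column (0 is plotted instead)
plot 'file' using 1:(valid(2) ? $2 : 0)
```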
N.B.: the `call` command also uses $'s as a special
character. See `call` for details about how to include a column number in a
`call` argument list.
If the `using` list has but a single entry, that <entry> will be used for y
and the data point number is used for x; for example, "`plot 'file' using 1`"
is identical to "`plot 'file' using 0:1`". If the `using` list has two
entries, these will be used for x and y. Additional entries usually supply
error values (uncertainties) in x and/or y. See `set style` for details about
plotting styles that make use of error information, and `fit`
for use of error information in curve fitting.
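As a hedged illustration of how additional entries carry error information, the commands below assume that columns 3 and 4 of the placeholder `'file'` hold uncertainties. The column layout is an assumption made for this sketch, not something the data format requires.

```gnuplot
# three entries: x, y, and the error in y
plot 'file' using 1:2:3 with yerrorbars

# four entries: x, y, error in x, error in y
plot 'file' using 1:2:3:4 with xyerrorbars
```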
'scanf' accepts several numerical specifications but `gnuplot`
requires all inputs to be double-precision floating-point
variables, so `%lf` is the only permissible specifier. 'scanf' expects to see
white space (a blank, tab ("\t"), newline ("\n"), or formfeed
("\f")) between numbers; anything else in the input stream must be explicitly
skipped.
Note that the use of "\t", "\n", or "\f" requires use of double-quotes
rather than single-quotes.
Examples:
This creates a plot of the sum of the 2nd and 3rd data against the first
(the format string specifies comma- rather than space-separated columns):
plot 'file' using 1:($2+$3) '%lf,%lf,%lf'
In this example the data are read from the file "MyData" using a more
complicated format:
plot 'MyData' using "%*lf%lf%*20[^\n]%lf"
The meaning of this format is:
%*lf        ignore a number
%lf         read a double-precision number (x by default)
%*20[^\n]   ignore up to 20 non-newline characters
%lf         read a double-precision number (y by default)
One trick is to use the ternary `?:` operator to filter data:
plot 'file' using 1:($3>10 ? $2 : 1/0)
which plots the datum in column two against that in column one provided
the datum in column three exceeds ten. `1/0` is undefined; `gnuplot`
quietly ignores undefined points, so unsuitable points are suppressed.
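The same trick extends to compound conditions and to the x entry. The thresholds below are arbitrary values chosen only for illustration.

```gnuplot
# keep only points whose third column lies strictly between 10 and 20
plot 'file' using 1:($3>10 && $3<20 ? $2 : 1/0)

# filter on x instead: points with negative x become undefined and are skipped
plot 'file' using ($1>=0 ? $1 : 1/0):2
```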
In fact, you can use a constant expression for the column number, provided it
doesn't start with an opening parenthesis; constructs like `using
0+(complicated expression)` can be used. The crucial point is that the
expression is evaluated once if it doesn't start with a left parenthesis, or
once for each data point read if it does.
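A minimal sketch of the difference, assuming a user variable named `col` (a hypothetical name introduced here) and a placeholder `'file'` with at least three columns:

```gnuplot
col = 3                      # hypothetical variable holding a column number

# no leading parenthesis: evaluated once, so this selects column 3 as y
plot 'file' using 1:col

# leading parenthesis: evaluated for every point, so y is the constant 3
plot 'file' using 1:(col)
```

The second form plots a horizontal row of points at y = 3 rather than the contents of column three, which is usually not what was intended.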
If timeseries data are being used, the time can span multiple columns. The
starting column should be specified. Note that the spaces within the time
must be included when calculating starting columns for other data. E.g., if
the first element on a line is a time with an embedded space, the y value
should be specified as column three.
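For instance, assuming data lines of the form `25/12/76 11:11 21.0`, and assuming time input is enabled with `set xdata time` and described with `set timefmt` (check the time-data settings of your version), the column counting works out like this:

```gnuplot
# the time occupies columns 1 and 2 because of the embedded space
set xdata time
set timefmt "%d/%m/%y %H:%M"

# the time starts at column 1; the y value is therefore column 3
plot 'file' using 1:3
```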
It should be noted that `plot 'file'`, `plot 'file' using 1:2`, and
`plot 'file' using ($1):($2)` can be subtly different: 1) if `file`
has some lines with one column and some with two, the first will invent x
values when they are missing, the second will quietly ignore the lines with
one column, and the third will store an undefined value for lines with one
point (so that in a plot with lines, no line joins points across the bad
point); 2) if a line contains text at the first column, the first will abort
the plot on an error, but the second and third should quietly skip the
garbage.
In fact, it is often possible to plot a file with lots of lines of garbage at
the top simply by specifying
plot 'file' using 1:2
However, if you want to leave text in your data files, it is safer to put the
comment character (#) in the first column of the text lines.