Readme for analog -- configuring the output
Analog 5.23:
Configuring the output
So far we have mainly discussed commands which control how analog reads
the logfiles. We now get on to commands for configuring the output.
First, you can change the style of the output using the
OUTPUT command. There
are five possible output styles, called HTML, PLAIN,
ASCII, LATEX and COMPUTER.
HTML produces web pages.
PLAIN produces plain text files, and ASCII is the same
as PLAIN except that it uses only ASCII characters (no accents etc.)
if possible. (This is because some applications don't understand accented
characters: for example, they're not always reliable over email.)
LATEX produces LaTeX code which can be turned into PDF if you have
the pdflatex command installed. (If you want to use the ordinary
latex command, specify PDFLATEX OFF.) The LATEX style is only
available with certain European languages (those using the US-ASCII, ISO-8859-1 or
ISO-8859-2 character sets). Yes, I know it gives overfull hboxes sometimes.
COMPUTER is a special format suitable for reading by a
computer (useful for reading into a spreadsheet, or post-processing with a
graphics package, for example).
There is a separate section about this
format later.
As well as a command like
OUTPUT PLAIN
you can also select PLAIN style with the command line argument
+a, and HTML with the command line argument
-a.
You can also specify OUTPUT NONE
for no output, if you are producing a cache file.
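As a sketch of the LaTeX options described above, a configuration that produces LaTeX output intended for the ordinary latex command might contain the following two lines. Both commands are taken from the text above; it is assumed here that the order in which they appear does not matter.

```
OUTPUT LATEX
PDFLATEX OFF   # target the ordinary latex command rather than pdflatex
```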
Next, you can change the language of the output. There
are two ways to do
this. The usual way is to use the LANGUAGE command. For example,
the command
LANGUAGE FRENCH
will give you the output in French. The available languages at the moment are
ARMENIAN, BULGARIAN (Windows-1251),
BULGARIAN-MIK (MIK-16), CATALAN,
TRAD-CHINESE (Big5),
CZECH (ISO Latin 2), CZECH-1250 (Windows-1250),
DANISH, DUTCH, ENGLISH,
US-ENGLISH, FINNISH, FRENCH,
GERMAN, HUNGARIAN, ITALIAN,
JAPANESE-EUC (EUC-JP), JAPANESE-JIS (ISO-2022-JP),
JAPANESE-SJIS (SJIS), JAPANESE-UTF (UTF-8),
KOREAN, LATVIAN, NORWEGIAN (Bokmål),
NYNORSK, POLISH, PORTUGUESE,
BR-PORTUGUESE, RUSSIAN (KOI8-R),
RUSSIAN-1251 (Windows-1251), SERBIAN,
SLOVENE (ISO Latin 2),
SLOVENE-1250 (Windows-1250), SPANISH, SWEDISH,
SWEDISH-ALT (alternative translation avoiding Anglicisms),
TURKISH and UKRAINIAN.
The following languages were available for previous versions of analog, but
have not yet been translated for version 5:
BOSNIAN, SIMP-CHINESE (GB2312),
CROATIAN, GREEK,
ICELANDIC, LITHUANIAN,
ROMANIAN and SLOVAK.
I hope that they will be available
soon, and as soon as they are, they will be added to the
analog home page.
The other way to specify a language is to use the LANGFILE
command. This is useful if you want to download a new language from the
analog home page, or
if you want to translate one yourself, or even if you want to change some
words or phrases or the way the dates and times are formatted in the output.
The LANGFILE command tells analog in which file to find the various
words and phrases for a new language. For example, the command
LANGFILE guarani.lng # or
LANGFILE /usr/etc/httpd/analog/lang/guarani.lng
would read from that file.
If the name of the file doesn't include a directory, it will be
looked for wherever analog normally expects to find its language files.
Some languages also have domains files or
report descriptions files available. These are normally selected automatically
by the LANGUAGE command. But you can tell analog to use different
ones with the DOMAINSFILE and
DESCFILE commands. Also, some languages
have translations of the form interface or
configuration file.
If you want to translate another language, I would be delighted! Do
contact me first to make sure that no-one else is
already translating the
same language. The file README.txt in the language directory, and
the English language file, contain some brief instructions for translating new
languages.
Equally, if you find any mistakes in the output in different languages, please
do let me know because I'm not able to check them
all myself!
You can specify the file for the output with the OUTFILE command,
or with a command line argument like +Ostats.htm. If you use the
filename - or stdout, the output will go to standard
output, which is normally the screen, but Unix users might like to redirect it
to another file or even into a pipe. You can also use an absolute path name,
like
OUTFILE /usr/bin/httpd/htdocs/stats.html # Unix
OUTFILE "Hard Disk:Server Apps:WebSTAR:Analog:Report.html" # Mac
If the name of the OUTFILE doesn't include a directory, it will be
put wherever analog expects to put its output files. (This location is built
in when the program is compiled.) For example, on Windows it would be in the
same folder as the analog executable. But if you use the +O command
line argument, the file is within the current directory.
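To illustrate the two simplest forms, here is a sketch using a filename without a directory and the special name for standard output. The name stats.html is only an illustration, and the two lines are alternatives rather than commands to be used together.

```
OUTFILE stats.html   # illustrative name; goes wherever analog puts its output files
OUTFILE -            # alternative: send the output to standard output
```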
You can include date codes in the OUTFILE in exactly the same way
as for the LOGFILE.
So for example,
OUTFILE stats%y%M%D.html
will produce filenames like stats990501.html. As with the
LOGFILE, the date used is the TO date if one was
specified, and otherwise the time of the start of the program.
Next, you need to know how to turn the different reports on
and off. There are 44 different reports which analog can produce,
if your web server has been configured to record the necessary data in the
logfiles. Each one has a short name, and a code letter or number, as
follows. (Note that the code letters are case sensitive:
Z is quite different from z, for example).
x GENERAL General Summary
1 YEARLY Yearly Report
Q QUARTERLY Quarterly Report
m MONTHLY Monthly Report
W WEEKLY Weekly Report
D DAILYREP Daily Report
d DAILYSUM Daily Summary
H HOURLYREP Hourly Report
h HOURLYSUM Hourly Summary
w WEEKHOUR Hour of the Week Summary
4 QUARTERREP Quarter-Hour Report
6 QUARTERSUM Quarter-Hour Summary
5 FIVEREP Five-Minute Report
7 FIVESUM Five-Minute Summary
S HOST Host Report
l REDIRHOST Host Redirection Report
L FAILHOST Host Failure Report
Z ORGANISATION Organisation Report
o DOMAIN Domain Report
r REQUEST Request Report
i DIRECTORY Directory Report
t FILETYPE File Type Report
z SIZE File Size Report
P PROCTIME Processing Time Report
E REDIR Redirection Report
I FAILURE Failure Report
f REFERRER Referrer Report
s REFSITE Referring Site Report
N SEARCHQUERY Search Query Report
n SEARCHWORD Search Word Report
Y INTSEARCHQUERY Internal Search Query Report
y INTSEARCHWORD Internal Search Word Report
k REDIRREF Redirected Referrer Report
K FAILREF Failed Referrer Report
B BROWSERREP Browser Report
b BROWSERSUM Browser Summary
p OSREP Operating System Report
v VHOST Virtual Host Report
R REDIRVHOST Virtual Host Redirection Report
M FAILVHOST Virtual Host Failure Report
u USER User Report
j REDIRUSER User Redirection Report
J FAILUSER User Failure Report
c STATUS Status Code Report
For details on what the various reports mean, and a summary of the commands
which control them, see the section on
Analog's reports.
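You can turn each report on or off with a command consisting of its short
name followed by ON or OFF,
or by using command line arguments like -5 and +s.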
You can also turn all reports except the General Summary on or off with the
commands ALL ON and ALL OFF, or with the command line
arguments +A and -A.
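As a sketch, the following lines turn off everything except the General Summary and then turn a few reports back on. It assumes that a report is switched by giving its short name from the table above followed by ON or OFF, in the same way as ALL ON and ALL OFF, and that later commands override earlier ones, so that ALL OFF comes first. The choice of reports is arbitrary.

```
ALL OFF        # every report except the General Summary
MONTHLY ON     # Monthly Report (code m)
REQUEST ON     # Request Report (code r)
REFSITE ON     # Referring Site Report (code s), like +s on the command line
```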
You can turn the descriptions of each report off
with the command
DESCRIPTIONS OFF
Even if DESCRIPTIONS is ON, the descriptions will only
appear if analog can find a report descriptions file in your language, or if
you specify one using the DESCFILE command: for example,
DESCFILE descriptions.txt
If the name of the descriptions file doesn't include a directory, it will be
looked for wherever analog normally expects to find its language files.
You can turn off the "Go To" lines in the output with the command GOTOS OFF.
GOTOS ON turns them on again, and GOTOS FEW puts the
"Go To" lines just at the top and bottom. GOTOS OFF can
be abbreviated with the -X command line argument, and
GOTOS ON with +X.
You can turn off the "Program started at" line
at the top of the output, and the "Running Time" line at the
bottom, with the command
RUNTIME OFF
and turn them on again with RUNTIME ON.
The figures in parentheses in the General Summary are
for the last seven days:
either the seven days before the TO time, or if no TO
time is given, the seven days before the time of the program start. The
figures for the last seven days are normally included if some, but not all,
of the requests fall in those seven days; but you can turn them off by means
of the command
LASTSEVEN OFF
Of course LASTSEVEN ON turns them on again.
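Putting the last few commands together, a configuration that trims most of the page furniture might look like the sketch below. All four commands and their arguments come from the text above; combining them in one file is just an illustration.

```
DESCRIPTIONS OFF   # no report descriptions
GOTOS FEW          # Go To lines only at the top and bottom
RUNTIME OFF        # no Program started at or Running Time lines
LASTSEVEN OFF      # no last-seven-days figures in the General Summary
```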
You can change the order of the reports by means of
the REPORTORDER command. You should list the
code letters for all possible reports in the order
you want them. Non-alphanumeric characters are ignored and so can be used as
separators.
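As an illustration of the syntax, the sketch below lists all 44 code letters in the same order as the table of reports above, with colons as separators. The grouping is arbitrary, and this is only an example of the format, not necessarily analog's default order; rearrange the letters to change the order of the reports.

```
REPORTORDER x1QmWDdHhw4657:SlLZo:ritzPEI:fsNnYykK:Bbp:vRM:ujJ:c
```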
You can turn the lines in General Summary on and off
individually using the GENSUMLINES command. The default is
GENSUMLINES ALL
meaning all available lines. You can turn lines off using a command like
GENSUMLINES -KL
(to turn off lines K & L) and turn them on again
with a command like
GENSUMLINES +K
You can specify the exact set of lines to include with a command like
GENSUMLINES CDFGHM
You now just need to know which lines have which code letters, which is given
in the following table.
(none) Successful requests (always listed)
B Average successful requests per day
C Logfile lines without status code
D Successful requests for pages
E Average successful requests for pages per day
F Failed requests
G Redirected requests
H Requests with informational status code
I Distinct files requested
J Distinct hosts served
K Corrupt logfile lines
L Unwanted logfile entries
M Data transferred
N Average data transferred per day
There is a command called IMAGEDIR
which tells analog where the various images used to make the output page should
live. It should be a URL, not the actual location on your disk, and it should
include the final slash. For example, you could have
IMAGEDIR img/ # relative URL: within the same directory as the output
IMAGEDIR /img/ # off the root directory of your server
IMAGEDIR http://www.myother.server.com/img/ # on another server
Some people are confused about the IMAGEDIR. It's just put in the
<img> tags in the output. You can see its effect if you look at the HTML
source of the output page.
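The PNGIMAGES command (PNGIMAGES ON) makes analog use png's rather than
gif's for the images in the output.
This is off by default because browser support for png's is still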
disappointingly weak, so it produces worse output on many browsers. This
decision may change in the future though. PNGIMAGES doesn't affect
the pie charts, which are always png's: but see the
JPEGCHARTS command for something
similar.
There are three commands which affect the top line of the
output. First,
the LOGO command allows you to replace the analog logo with
another image (for example, your organisation's logo). You can say
LOGO picture.gif # for this file
LOGO /images/picture2.gif # a different file
LOGO none # for no logo
The logo is assumed to be inside the IMAGEDIR unless it starts
with a slash, or contains ://.
There are commands HOSTNAME and
HOSTURL which
affect the name and link at the end of the title line. For example, I might
specify HOSTNAME "Stephen Turner", together with a suitable HOSTURL,
to generate the title "Web Server Statistics for
Stephen Turner".
Again, you can use none as the HOSTURL to specify no
link. Analog will normally translate characters in the hostname to HTML if
necessary. So to include literal HTML, such as accented characters, in the
output you need to precede them by a backslash, like this:
HOSTNAME "M\&uuml;ller & S\&ouml;hne"
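A sketch of the two commands together is shown below. The name is the one from the title example above; the URL is a placeholder, not the author's actual address.

```
HOSTNAME "Stephen Turner"
HOSTURL http://www.example.com/   # placeholder URL; use none for no link
```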
There are commands called HEADERFILE and
FOOTERFILE.
These let you specify files to be inserted near the top and bottom of your
output. You can also specify
HEADERFILE none
to cancel a previously-specified header file.
Again, if the name of the HEADERFILE or FOOTERFILE
doesn't include a directory, analog will assume a directory, specified
when the program was compiled.
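For instance, a sketch with hypothetical filenames (neither file is supplied with analog, and both would be looked for in the built-in directory because no directory is given):

```
HEADERFILE header.html   # inserted near the top of the output
FOOTERFILE footer.html   # inserted near the bottom of the output
```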
There is a command called STYLESHEET to
specify a style sheet for the output. This allows you to specify colours etc.
(See http://www.w3.org/Style/css/
for how to write a style sheet.) For example,
STYLESHEET /housestyle.css
STYLESHEET none # to cancel it
Hint: a common mistake in writing style sheets is to declare a font-family
for the body, but then not put <pre> sections back into a monospaced
font. This stops the columns lining up properly. Your style sheet should
contain a line like the following:
pre { font-family: monospace }
There are three related commands called
SEPCHAR,
REPSEPCHAR and DECPOINT. These specify single characters
to be used as the thousands separator in numbers, the thousands separator
within the columns in the reports, and the decimal point. Normally, these will
be set automatically for the language you choose, but
you can change them if you want. For example, a French user might choose
SEPCHAR " "
REPSEPCHAR none
DECPOINT ,
to make "three thousand and a quarter" look like
"3 000,25" in text and "3000,25" in the reports.
There is a command called RAWBYTES. Specify
RAWBYTES ON
if you want the exact number of bytes to be listed, or
RAWBYTES OFF if you want the number of kilobytes or Megabytes
as appropriate to be listed instead.
There are commands called
HTMLPAGEWIDTH, PLAINPAGEWIDTH and
LATEXPAGEWIDTH which specify the
width of the page. Which one is used depends on whether the output style is
HTML, PLAIN (including ASCII), or
LATEX. The output is not guaranteed to fit in this width, but
analog will take notice of it when choosing the width of the time graphs,
when sorting the Host Report alphabetically, when drawing horizontal rules,
and when writing some bits of text.
There is a command called NOROBOTS which
stops robots which obey the
robots META tag
from indexing your output page or following its links. Normally this is set to
ON but you can specify NOROBOTS OFF if you don't mind
robots finding your other pages this way. Note that you will stop far more
robots if you also put your stats page in your
robots.txt
file; on the other hand, this file has to be kept up to date by the server
administrator.
Sometimes your server is not in the same timezone as
you, or at least records the times in its logfiles in a different timezone
(for example GMT). So that you can get your
statistics in your local time, there is a command called
LOGTIMEOFFSET to change the time by a certain number of minutes. As
with the LOGFORMAT command, this only
affects logfiles which come later in the same configuration
file.
You have to be careful using this command. Because
daylight saving time is in operation in different parts of the world at
different times, analog cannot attempt to convert between different
timezones. So it's your responsibility to set the right offset for different
times of year. For example, if you were in Chicago, but your server was
recording time in GMT, you would need to specify two different time offsets,
one of minus five hours for summer and one of minus six hours for winter. You
would need to split your logfiles in the right places and then give a separate
LOGTIMEOFFSET command before each LOGFILE command.
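For the Chicago example, a sketch might look like this. The offsets are the ones given above converted to minutes (minus five hours is -300, minus six hours is -360), and the logfile names are hypothetical. Each LOGTIMEOFFSET comes before the LOGFILE it should apply to, since it only affects logfiles which come later in the configuration file.

```
LOGTIMEOFFSET -300      # summer: five hours behind GMT
LOGFILE summer.log
LOGTIMEOFFSET -360      # winter: six hours behind GMT
LOGFILE winter.log
```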
There is also a related command called TIMEOFFSET. This tells
analog how much to offset the time of the computer on which it is running
(rather than the computer running the server), to get your local time.