You can define server-specific configuration settings for http-analyze
in an analyzer configuration file. To have the analyzer use such a
configuration file, specify its name with the option
-c cfgfile or the environment variable HA_CONFIG.
Note that command line options always take precedence over settings in a
configuration file.
If the option -i newcfg is specified, http-analyze
creates a configuration template in the file newcfg.
Any other command line options present are translated into the
corresponding definitions in the new configuration file. The settings
can then be customized further by editing the configuration
definitions manually in a standard text editor.
To convert an old configuration file to the new format, specify its name using
the option -c in addition to -i. This instructs the analyzer
to retain any settings from the old file.
The configuration file contains a single directive per line. Except for
IndexFiles, PageView, AddDomain, VirtualNames,
Ign*, and Hide*, each directive may appear only once in the
configuration file. A directive is followed by one or two value
fields, which must be separated from the directive and from each other by one
or more tab characters. Blanks are considered part of the string only in the
optional third field. All directive names are case-insensitive. Comment lines
starting with a hash character (#) are ignored.
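The line format described above can be sketched in Python as follows. This is only an illustration and not the analyzer's own parser: the function name `parse_config_line` is hypothetical, and lower-casing the directive is just one way to model case-insensitive names.

```python
def parse_config_line(line):
    """Split one configuration line into (directive, values), or return None.

    Hypothetical helper: fields are separated by one or more tabs,
    and lines starting with '#' are treated as comments.
    """
    line = line.rstrip("\r\n")
    if not line.strip() or line.startswith("#"):
        return None  # blank line or comment
    # Repeated tabs produce empty pieces, which are dropped here.
    fields = [f for f in line.split("\t") if f != ""]
    directive = fields[0].lower()  # directive names are case-insensitive
    return directive, fields[1:]
```

Because only tabs separate fields, a value such as `MY COMPANY` stays intact even though it contains a blank.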
- 3DWinSize widthxheight
- Defines the size of the 3D window (default: 520x420 pixels). Example:
- 3DWinSize 540x450
- 3DWindow keyword
- Defines the window in which the VRML model is displayed (same as option
-W). The keyword is either extern (the default), which displays
the model in a new, external window, or intern, which displays it
in the lower half of the main frame. Example:
- 3DWindow intern
- AddDomain domain string
- Adds an entry to the domain table so that hosts in domain are
counted under the mock domain string. Wildcards in domain
are ignored. This directive is useful for collecting certain hostnames (for
example, the hosts of online services that operate world-wide) under a single
string (item) instead of under the country associated with their top-level domain.
Example:
- AddDomain .compuserve.com CompuServe
- AuthURL boolean_value
- Defines whether accesses which required authentication should be skipped.
By default, such URLs appear in the report just like ordinary URLs. If
AuthURL is set to Off, No,
None, False, or 0 the analyzer skips authenticated
requests in the logfile, so that they will be suppressed from the
statistics report. Example:
- AuthURL No
- BtnSymlink boolean_value
- Creates symbolic links to the required buttons and files in
HA_LIBDIR instead of copying them into the output directory.
If you are going to analyze a large number of virtual servers which
reside on the same host, you can probably save disk space by avoiding
copies of all buttons and files into each output directory. Note that
this directive can be used only on systems which support symbolic links.
Example:
- BtnSymlink Yes
- CustLogoW image srvurl and
CustLogoB image srvurl
- Defines images for use as customer logos in the statistics report.
This feature is available only in the commercial version of the analyzer.
image is the name of the image file relative to the output directory
OutputDir and srvurl is the URL to
be followed if the user
clicks on the image. To use your own logos create two images - one for use
on white backgrounds (CustLogoW) and another
one for use on black backgrounds (CustLogoB).
The images should be approximately 72x72
pixels in size and must be placed into the buttons subdirectory of the
central libdir (HA_LIBDIR/btn). Next time a report is
generated, the analyzer copies those logos into the output directory and
includes them in the report. Example:
- CustLogoW btn/mycompany_sw.png http://www.mycompany.com/
- CustLogoB btn/mycompany_sb.png http://www.mycompany.com/
- DefaultMode mode
- The default operation mode of http-analyze.
The value field contains either the keyword daily for short
statistics mode or monthly for full statistics mode (see also
options -d and -m). If left undefined, the default is
full statistics mode (monthly). Example:
- DefaultMode daily
- DocRoot docroot
- Restricts logfile analysis to the given Document Root (same as option
-R). If docroot is prefixed by a `!', analysis
takes place for all directories except docroot. If docroot
does not start with a slash (`/'), it is interpreted as the
name of a virtual server, which is matched against the normally unused
second field of a logfile entry. Useful for virtual servers with a
separate Document Root. Note: Do not define this directive to analyze
the whole server. Explicitly setting DocRoot to `/' (the
default) only increases processing time. Example:
- DocRoot /customer/
- DocRoot www.customer.com
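The three forms a DocRoot value can take might be told apart as in the following Python sketch. The function name `interpret_docroot` and its return shape are hypothetical, and the sketch assumes that a leading `!' is removed before the slash test, which the description above does not state explicitly.

```python
def interpret_docroot(value):
    """Classify a DocRoot value (illustrative only).

    Returns (kind, negate, name), where kind is "directory" for values
    starting with a slash and "virtual_server" otherwise.
    """
    negate = value.startswith("!")  # '!' means: everything except this docroot
    if negate:
        value = value[1:]           # assumption: strip '!' before further checks
    kind = "directory" if value.startswith("/") else "virtual_server"
    return kind, negate, value
```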
- HTMLCharSet chrset
- Force use of chrset for the browser's encoding when displaying
the statistics report (same as option -C). This is needed for
languages which require special character sets such as Chinese. See also
the section about Multi-National Language Support.
Example:
- HTMLCharSet iso-8859-1
- HTMLPrefix prefix and
HTMLTrailer trailer
- The HTML prefix and trailer to be inserted into the
statistics output files at the top and bottom of the page. If defined, the
HTMLPrefix string must include the
<BODY> tag.
To read the HTML code from a file, specify its name as the prefix
or trailer. Example:
- HTMLPrefix <BODY BGCOLOR="#FF0000">
- HTMLTrailer <A HREF="/intern/">Back</A> to the internal page.
- HeadFont fontlist, TextFont
fontlist and ListFont fontlist
- The fonts to use for headers, for regular text, and for the detailed
lists. If unset, the analyzer uses a list of common sans-serif fonts for
headers and regular text and a monospaced (fixed) font for the detailed
lists. To force the navigator's default for fonts, use the keyword
default as the fontname. Example:
- HeadFont Helvetica,Arial,Geneva,sans-serif
- TextFont Helvetica,Arial,Geneva,sans-serif
- ListFont Courier,fixed
- HeadSize size, TextSize size,
SmallSize size and ListSize size
- The font sizes for headings (navigator default, usually 3),
regular text (default: 2), small text (default: 1) and lists (default: 2).
TextSize replaces the former FontSize, which is still
recognized for backward compatibility with older configuration files.
Example:
- HeadSize 4
- TextSize 3
- SmallSize 2
- HideAgent agent string
- Hide a browser type under an arbitrary string (item).
Needed only for a certain browser whose vendor still can't spell
its name correctly. Only the leading part of the browser type is
compared against agent, so no wildcards are needed in the
second field. Example:
- HideAgent Mozilla/4.0 (compatible; MSIE 4. MSIE 4.*
- HideAgent Mozilla/3.0 (compatible; MSIE 3. MSIE 3.*
- HideRefer referrer string
- Hide certain referrer URLs under an arbitrary string (item).
Useful to map different referrer URLs for a given host to a common name.
Since only the leading string of the referrer URL is compared against
referrer, there is no need to specify wildcards. As in
HideAgent, a wildcard suffix is
removed from the string, while a wildcard prefix is taken literally.
- If the second argument contains a string in square brackets,
this defines the CGI parameter which specifies the search key for
search engines. In this case, the search key will be extracted from
the argument list and prominently displayed after the name of the
search engine/web server. See also the configuration file template
produced by http-analyze -i for more examples of how
to use the HideRefer directive. Example:
- HideRefer http://altavista.digital.com/ AltaVista [q=]
- HideRefer http://lycospro.lycos.com/ Lycos [query=]
- HideRefer http://www.excite.com/ Excite [search=]
- HideRefer http://www.dino-online.de/ Dino Online [query=]
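The search-key extraction described for HideRefer can be illustrated with a short Python sketch. It is not the analyzer's code: the helper names `split_label` and `extract_search_key` are hypothetical, and the referrer URL in the usage note is made up for the example.

```python
import re
from urllib.parse import urlsplit, parse_qs

def split_label(label):
    """Split 'AltaVista [q=]' into ('AltaVista', 'q'); the key is None if absent."""
    m = re.match(r"^(.*?)\s*\[([^\]=]+)=\]\s*$", label)
    if m:
        return m.group(1), m.group(2)
    return label, None

def extract_search_key(referrer_url, param):
    """Return the first value of the CGI parameter `param`, or None."""
    values = parse_qs(urlsplit(referrer_url).query).get(param)
    return values[0] if values else None
```

For instance, `split_label("AltaVista [q=]")` yields the display name and the parameter `q`, and `extract_search_key` applied to a referrer whose query string contains `q=web+statistics` would return the decoded search key.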
- HideSys hostname string
- Hide a hostname under an arbitrary string (item).
The string may contain blanks. If the first character of string is
a `[', this item is suppressed in the Top N lists.
Hidden items are accounted for separately, but in the summary they
are collected under the description defined with this directive.
You may use the wildcard character `*' as either a
prefix or as a suffix of the hostname (as in *.host.com and
192.168.12.*), but not as both. Hostnames are case-insensitive.
- When building the list of countries, http-analyze determines
the country from the top-level domain given in hostname.
If hostname is an IP number, you can optionally define the
top-level domain to be accounted for by appending it in square brackets
to the string as shown in the last example below. Example:
- HideSys *.mycompany.com MY COMPANY
- HideSys 192.168.12.* MY COMPANY [COM]
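The wildcard rule for hostnames (a `*' as prefix or as suffix, but not both, with case-insensitive comparison) could be modelled as in the Python sketch below. The function name `matches_hostname` is hypothetical, and the sketch makes no claim about how the analyzer itself performs the lookup.

```python
def matches_hostname(pattern, hostname):
    """Match a hostname against a pattern with an optional leading or trailing '*'."""
    pattern, hostname = pattern.lower(), hostname.lower()  # hostnames are case-insensitive
    if pattern.startswith("*"):
        return hostname.endswith(pattern[1:])     # '*.host.com' style
    if pattern.endswith("*"):
        return hostname.startswith(pattern[:-1])  # '192.168.12.*' style
    return hostname == pattern                    # no wildcard: exact match
```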
- HideURL url string
- Hide a URL under an arbitrary string (item).
The string may contain blanks. If the first character of string
is a `[', this item is suppressed in the Top N lists.
Hidden items are accounted for separately, but in the summary they are
collected under the description defined with this directive. You may use
the wildcard character `*' as either a prefix or as a suffix
of the URL (as in *.map and /subdir/*), but not as both.
URLs are case-sensitive as required by the HTTP standard. If the option
-M is specified, URLs will become case-insensitive for compatibility
with non-compliant web servers. Note that images are hidden automatically
under All images by default unless -x is specified. Example:
- HideURL *.map [All image maps]
- HideURL /robots.txt [Robot control file]
- HideURL /newsletter/* MyCompany's Monthly Newsletter
- HideURL /~delta-t/ DELTA-t Homepage
- IgnURL url and IgnSys hostname
- Ignore entries with a specific URL or accesses from a certain system.
You may use the wildcard character `*' as either a prefix or
as a suffix of the URL or the hostname (as in *.png,
/subdir/file* and *.host.com), but not as both.
Note that every logfile entry is compared against this list while
http-analyze reads the logfile, as opposed to the
HideURL and
HideSys directives,
which are looked up only after all entries have been reduced
to the set of unique URLs and hostnames, respectively.
Therefore, many IgnURL/IgnSys definitions will
significantly increase processing time of http-analyze. Example:
- IgnURL *.png,*.jpg,*.jpeg
- IndexFiles idxfile[,idxfile...]
- Defines additional directory index filenames (same as option -H).
The name index.html is pre-defined by default. http-analyze
truncates URLs containing an index filename so that they merge with
`/' (their "base URL").
For example, /dir/index.html is truncated to /dir/.
You can add up to 9 more names for directory index files.
Note that each name requires another table lookup, which may significantly
increase processing time. Example:
- IndexFiles Welcome.html,home.html,index.htm
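The truncation of index filenames can be sketched in Python as follows. The function name `strip_index` is hypothetical, and the sketch ignores query strings, which the description above does not cover.

```python
def strip_index(url, index_files=("index.html",)):
    """Reduce a URL ending in a directory index filename to its base URL."""
    head, _, tail = url.rpartition("/")
    if tail in index_files:
        return head + "/"  # '/dir/index.html' becomes '/dir/'
    return url
```

With `IndexFiles Welcome.html,home.html,index.htm`, the tuple passed as `index_files` would hold those three names in addition to the pre-defined `index.html`.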
- Language lang
- Use the language lang for warning messages and for the
statistics report (same as option -L). See the section
Multi-National Language Support for more
information about localization of http-analyze. Example:
- Language de
- LogFile filename
- The name of the server's logfile. If you define a default name for the
logfile, this file is processed if no other filenames are explicitly
specified on the command line. If no logfile is specified, http-analyze
always reads stdin. Example:
- LogFile /usr/ns-home/www/logs/access
- LogFormat logfmt
- Use this logfile format. Valid values for logfmt are
auto for auto-sensing the logfile format, clf for the
NCSA Common Logfile Format, or dlf and elf for
the two supported variants of the W3C Extended Logfile Format.
See the section Logfile Formats for a
detailed description of those formats. Example:
- LogFormat clf
- MSIISmode boolean_value
- Use case-insensitive string comparison for URLs. Needed for MS IIS which
does not distinguish between upper- and lower-case characters. MS users may
regard this as an enhancement, while for the rest of the world this is just
a violation of the RFC 2068 HTTP standard and should be ignored. Example:
- MSIISmode Yes
- NavWinSize widthxheight
- Defines the size of the navigation window which pops up in the
conventional interface if JavaScript is enabled. Useful if the browser
displays scrollbars when using the default size of 420x190 pixels. Example:
- NavWinSize 440x200
- NavigFrame size
- Defines the size of the navigation frame in pixels. Useful if the
browser displays scrollbars when using the default size of 120 pixels.
Example:
- NavigFrame 140
- NoiseLevel hits
- Sets the noise-level to hits. If a noise-level is defined,
all URLs, sites, agents and referrer URLs with hits below this level
are collected under the item Noise in the Top N lists
and overviews to avoid cluttering up those lists. Example:
- NoiseLevel 7
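The effect of a noise level on a Top N list might look like the following Python sketch. The function name `apply_noise_level` is hypothetical, and summing the hits of all collected items into the Noise entry is an assumption made for illustration.

```python
def apply_noise_level(hits_by_item, noise_level):
    """Collect all items with hits below noise_level under the item 'Noise'."""
    result = {}
    noise = 0
    for item, hits in hits_by_item.items():
        if hits < noise_level:
            noise += hits        # assumption: Noise shows the summed hits
        else:
            result[item] = hits
    if noise:
        result["Noise"] = noise
    return result
```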
- OutputDir directory
- The name of the directory where the output files of the statistics
report should be created (same as option -o). By default, the
output directory is the current directory. Example:
- OutputDir /usr/www/htdocs/stats
- PageView pattern[,pattern...]
- Defines additional pageview patterns (same as option -G).
All URLs matching one of the patterns are classified as pageviews
(text files). If pattern starts with a slash (`/'), it is treated
as a prefix; otherwise it is treated as a suffix. Each URL is
compared against these prefixes and suffixes. The suffix .html is pre-defined by default.
You can add up to 9 more patterns here, for example .shtml, .text
and /cgi-bin/. Note that each pattern requires another table lookup,
which may significantly increase processing time. Example:
- PageView .shtml,.text,/cgi-bin/
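The prefix/suffix rule for pageview patterns can be sketched in Python like this. The function name `is_pageview` is hypothetical and the code only illustrates the matching rule, not the analyzer's table lookup.

```python
def is_pageview(url, patterns=(".html",)):
    """Return True if the URL matches one of the pageview patterns."""
    for pattern in patterns:
        if pattern.startswith("/"):
            if url.startswith(pattern):  # leading slash: treat as prefix
                return True
        elif url.endswith(pattern):      # otherwise: treat as suffix
            return True
    return False
```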
- PrivateDir prvdir
- Defines the name of a "private" directory for the detailed
lists of files, sites, browsers and referrer URLs
(same as option -p). Because prvdir must reside directly under
the output directory, its name may not contain any slashes (`/').
A private directory for detailed lists may be useful to restrict access to
those lists if the rest of the statistics report is publicly available.
Note that for restricting access to the complete statistics report, you do
not need to place the detailed lists in a private directory. Example:
- PrivateDir lists
- RegInfo customer_name registration_ID
- Defines the customer's name and the registration ID, which are both
shown on the main page in the summary report. Example:
- RegInfo MyCompany 3745JMJZ00000311300000682344
- ReportTitle title
- The document title to use in the statistics report. Example:
- ReportTitle Access Statistics for MyCompany
- ServerName srvname
- The official name of the server (same as option -S).
If no server name is defined, http-analyze uses the hostname
of the system it is running on. The server name must be a fully
qualified domain name, not a URL. Example:
- ServerName www.mycompany.com
- ServerURL srvurl
- The URL of the server to be used for hotlinks in URL lists (same as
option -U). Useful if the report for your web server is published
on another server. Also necessary for virtual servers to have
http-analyze generate correct hypertext links in the report. Example:
- ServerURL http://www.mycompany.com
- Session time
- The time-window for counting sessions. All accesses from the same host
that fall inside this time-window are accounted for as the
same session. If the distance between two adjacent accesses from the same
host is greater than the time-window, the accesses from this host are
accounted for as different sessions. Example:
- Session 4 hours
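One way to read the session rule is sketched below in Python. It follows the "distance between two adjacent accesses" wording, assumes the accesses are already sorted by time, and uses the hypothetical function name `count_sessions`; the analyzer's actual bookkeeping may differ.

```python
def count_sessions(accesses, window_seconds):
    """Count sessions in an iterable of (host, timestamp_in_seconds) pairs.

    Assumes the pairs are ordered by timestamp. A host starts a new session
    when the gap to its previous access exceeds the time-window.
    """
    last_seen = {}
    sessions = 0
    for host, ts in accesses:
        prev = last_seen.get(host)
        if prev is None or ts - prev > window_seconds:
            sessions += 1
        last_seen[host] = ts
    return sessions
```

With `Session 4 hours`, the window would be `4 * 3600` seconds.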
- ShowDomain number
- Defines the number of components in a domain name which make up the
organizational part (same as option -Z). This is usually the
second-level domain, so that the last two components of the
domain name (for example, company.com) are used as the
organizational part. However, some countries prefer to use third-level
domains, so that the hostnames use 4 or more components, where the last
3 are used for the organizational part (as in company.co.uk).
To recognize such third-level domains, ShowDomain can be set to
the value 3. Hostnames with exactly 3 components will still be reduced
to their second-level domain if ShowDomain is set to 3. Example:
- ShowDomain 3
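The reduction of a hostname to its organizational part can be sketched in Python as follows. The function name `organizational_part` is hypothetical, and the sketch does not handle numeric IP addresses.

```python
def organizational_part(hostname, show_domain=2):
    """Return the last `show_domain` components of a hostname.

    Hostnames with no more components than `show_domain` fall back to
    their second-level domain, as described for ShowDomain 3.
    """
    parts = hostname.lower().split(".")
    n = show_domain
    if len(parts) <= n:
        n = 2
    return ".".join(parts[-n:])
```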
- StripCGI boolean_value
- Defines whether arguments to CGI scripts are stripped. Turning this
off keeps the arguments (same as option -q).
By default, http-analyze strips arguments from CGI URLs to be
able to lump them together. If your server creates dynamic HTML files
through a CGI script, they are reduced to the URL of the script.
If StripCGI is set to Off, No, None,
False or 0, those argument lists are left intact and
CGI URLs with different arguments are treated as different URLs.
Note that this only works for requests to scripts that receive
their arguments using the GET method, not the POST method.
See the section Interpretation of the results
for an explanation of the request methods. Example:
- StripCGI No
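The difference between the two settings can be illustrated with a small Python sketch. The names `strip_enabled` and `normalize_cgi_url` are hypothetical, and comparing the boolean value case-insensitively is an assumption.

```python
# Values that turn the directive off, as listed in the description above.
OFF_VALUES = {"off", "no", "none", "false", "0"}

def strip_enabled(value):
    """Return False if the directive value is one of the 'off' keywords."""
    return value.strip().lower() not in OFF_VALUES  # assumption: case-insensitive

def normalize_cgi_url(url, strip_cgi=True):
    """Drop the argument list of a GET request unless stripping is disabled."""
    return url.split("?", 1)[0] if strip_cgi else url
```

With stripping enabled, `/cgi-bin/search?q=foo` and `/cgi-bin/search?q=bar` both count as `/cgi-bin/search`; with `StripCGI No` they remain two different URLs.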
- Suppress subopt,...
- Suppress certain lists in the report (same as -s).
subopt may be one of:
| AVLoad | to suppress the average load report (top seconds/minutes/hours), |
| URLs | to suppress the overview and list of URLs/items, |
| URLList | to suppress the list of URLs/items only, |
| Code404 | to suppress the list of Code 404 (Not Found) responses, |
| Sites | to suppress the overview and list of client domains, |
| RSites | to suppress the overview of reverse client domains, |
| SiteList | to suppress the list of all client domains/hostnames, |
| Agents | to suppress the overview and list of browser types, |
| Referrer | to suppress the overview and list of referrer URLs, |
| Country | to suppress the list of countries, |
| Pageviews | to suppress pageview rating (cached files are shown instead), |
| AuthReq | to suppress requests which required authentication, |
| Graphics | to suppress images such as graphs and pie charts, |
| Hotlinks | to suppress hotlinks in the list of all URLs, |
| Interpol | to suppress interpolation of values in graphs. |
- Example:
- Suppress Country,Interpol
- TLDFile filename
- Use filename for the list of top-level domains (same as option
-T). This list includes all ISO two-letter country domains, the
well-known domains .net, .int, .org, .com,
.edu, .gov, .mil, .arpa, .nato, and
the new CORE top-level domains .firm, .info,
.shop, .arts, .web, .rec, and .nom.
The length of a domain in the TLD file may not exceed 6 characters.
Since http-analyze uses its built-in defaults if no TLD file
is specified, you will rarely need this directive. Example:
- TLDFile /usr/local/lib/http-analyze/TLD
- TblFormat tblname specifier
- Defines the layout of tables in the statistics report. The argument
tblname may be one of:
| Month | for the statistics of the last 12 months (main page) |
| Day | for the daily statistics in the short and full summaries |
| Load | for the average load by weekday, hour, minute, second |
| Country | for the list of countries |
| TopTen | for all Top N lists |
| Overview | for all overviews |
| Lists | for all detailed lists (preformatted text) |
| NotFound | for the list of NotFound responses |
- The specifier string defines the items to be shown in the table:
| n, N | an index number or label (don't touch!) |
| h, H | the number of hits |
| f, F | the number of files sent |
| c, C | the number of cached files |
| p, P | the number of pageviews |
| s, S | the number of sessions |
| k, K | the amount of data sent in Kbytes (integer value) |
| B | the amount of data sent in bytes (float value) |
| L | a dynamically created label (don't touch!) |
- If a format specifier is used in upper-case, the value displayed
in the report will include the percentage for this number. Example:
- TblFormat Month n h f c p s k
- Top{Days,Hours,Minutes,Seconds,URLs,Sites,Agents,Refers}, LeastURLs
- Defines the size of certain Top N tables and lists. If set to zero,
the corresponding list will be suppressed. Example:
- TopURLs 20
- LeastURLs 0
- TopDays 14
- VirtualNames virtname,...
- The list of additional ("virtual") names for this server
to be classified as self-referrer URLs. The server's primary name
(from ServerName or ServerURL) is pre-defined already.
If virtname doesn't include a protocol specifier, two URLs with the
http and the https protocol specifier will be added
for each name. Since self-referrers are suppressed from the list of
referrer URLs, the remaining entries give a good impression about
external pages referring to some document on your site. Example:
- VirtualNames www2.mycompany.com,mycompany.com
- VirtualNames www.customer.com,customer.com
- VirtualNames http://www.other.com,https://secure.other.com
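The expansion of virtual names into self-referrer URLs can be sketched in Python as follows. The function name `self_referrer_urls` is hypothetical, and detecting a protocol specifier by looking for `://` is an assumption made for the illustration.

```python
def self_referrer_urls(names):
    """Expand virtual names into the URLs treated as self-referrers."""
    urls = []
    for name in names:
        if "://" in name:
            urls.append(name)               # protocol given: use as-is
        else:
            urls.append("http://" + name)   # no protocol: add both variants
            urls.append("https://" + name)
    return urls
```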
- VRMLProlog file
- The name of a prolog file for a yearly VRML model (same as option
-P). Pathnames not beginning with a `/' are relative
to OutputDir. If a prolog file is given, an additional yearly
model with all 12 monthly models embedded as inlines is created.
See the section Output Files for further
information about this yearly model. Example:
- VRMLProlog 3Dprolog.wrl