CONFIGURATION FILE

You can define server-specific configuration settings for http-analyze in an analyzer configuration file. To have the analyzer use such a configuration file, specify its name with the option -c cfgfile or the environment variable HA_CONFIG. Note that command line options always take precedence over settings in a configuration file.

If the option -i newcfg is specified, http-analyze creates a configuration template in the file newcfg. Any other command line options present will be transformed into their appropriate definitions in the new configuration file. The settings can then be customized further by manually editing the configuration definitions using a standard text editor.

To update an old configuration file into a new format, specify its name using the option -c in addition to -i. This will instruct the analyzer to retain any settings from the old file.

The configuration file contains a single directive per line. Except for IndexFiles, PageView, AddDomain, VirtualNames, Ign*, and Hide*, each directive may appear only once in the configuration file. Following a directive field there are one or two value fields, which must be separated from the directive and each other by one or more tab characters. Blanks are considered part of the string in an optional third field only. All directive names are case-insensitive. Comment lines starting with a hash character (#) are ignored.
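The line format described above can be sketched in Python as follows. This is only an illustration of the stated rules (tab-separated fields, case-insensitive directive names, comment lines); the function name parse_config_line is hypothetical, and skipping empty lines is an assumption not stated in this section. The analyzer's own parser may differ in detail.

```python
def parse_config_line(line):
    """Split one configuration line into (directive, [values]).

    Hypothetical helper. Returns None for comment lines and, as an
    assumption, for empty lines.
    """
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return None
    # Fields are separated by one or more tabs; blanks stay inside a field.
    fields = [field for field in line.split("\t") if field != ""]
    directive = fields[0].lower()  # directive names are case-insensitive
    return directive, fields[1:]
```

For a line such as HideSys, a tab, *.mycompany.com, a tab, MY COMPANY, the sketch yields the directive name in lower case plus two value fields, with the blank inside MY COMPANY preserved.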

3DWinSize widthxheight
Defines the size of the 3D window (default: 520x420 pixels). Example:
3DWinSize 540x450
3DWindow keyword
Defines the 3D window the VRML model is displayed in (same as option -W). The keyword may be either extern (default) or intern for display of the VRML model in a new, external window or in the lower half of the main frame respectively. Example:
3DWindow intern
AddDomain domain string
Adds entries to the domain table, causing certain domains to be allocated to the mock domain string. Wildcards in domain are ignored. This directive is useful to collect certain hostnames (for example, the hosts of online services operating world-wide) under some string (item) instead of the country associated with the top-level domain. Example:
AddDomain .compuserve.com CompuServe
AuthURL boolean_value
Defines whether accesses which required authentication should be skipped. By default, such URLs appear in the report just like ordinary URLs. If AuthURL is set to Off, No, None, False, or 0 the analyzer skips authenticated requests in the logfile, so that they will be suppressed from the statistics report. Example:
AuthURL No
BtnSymlink boolean_value
Creates symbolic links to the required buttons and files in HA_LIBDIR instead of copying them into the output directory. If you are going to analyze a large number of virtual servers which reside on the same host, you can probably save disk space by avoiding copies of all buttons and files into each output directory. Note that this directive can be used only on systems which support symbolic links. Example:
BtnSymlink Yes
CustLogoW image srvurl and CustLogoB image srvurl
Defines images for use as customer logos in the statistics report. This feature is available only in the commercial version of the analyzer. image is the name of the image file relative to the output directory OutputDir and srvurl is the URL to be followed if the user clicks on the image. To use your own logos create two images - one for use on white backgrounds (CustLogoW) and another one for use on black backgrounds (CustLogoB). The images should be approximately 72x72 pixels in size and must be placed into the buttons subdirectory of the central libdir (HA_LIBDIR/btn). Next time a report is generated, the analyzer copies those logos into the output directory and includes them in the report. Example:
CustLogoW btn/mycompany_sw.png http://www.mycompany.com/
CustLogoB btn/mycompany_sb.png http://www.mycompany.com/
DefaultMode mode
The default operation mode of http-analyze. The value field contains either the keyword daily for short statistics mode or monthly for full statistics mode (see also options -d and -m). If left undefined, the default is full statistics mode (monthly). Example:
DefaultMode daily
DocRoot docroot
Restricts logfile analysis to the given Document Root (same as option -R). If docroot is prefixed by a `!', analysis takes place for all directories except docroot. If docroot does not start with a slash (`/'), it is interpreted as the name of a virtual server, which is matched against the normally unused second field of a logfile entry. Useful for virtual servers with a separate Document Root. Note: Do not define this directive to analyze the whole server. Explicitly setting DocRoot to `/' (the default) only increases processing time. Example:
DocRoot /customer/
DocRoot www.customer.com
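The DocRoot selection rules can be illustrated with a small Python sketch. The function in_docroot and its parameters are hypothetical; in particular, applying the `!' prefix to virtual server names as well as to directories is an assumption, since the text above only describes it for directories.

```python
def in_docroot(url, second_field, docroot):
    """Decide whether a logfile entry falls under the DocRoot setting.

    url          -- requested URL from the logfile entry
    second_field -- second field of the logfile entry (virtual server name)
    docroot      -- value of the DocRoot directive
    """
    negate = docroot.startswith("!")
    if negate:
        docroot = docroot[1:]
    if docroot.startswith("/"):
        # A leading slash means a directory prefix of the URL.
        matched = url.startswith(docroot)
    else:
        # Otherwise the value names a virtual server.
        matched = second_field == docroot
    return matched != negate
```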
HTMLCharSet chrset
Force use of chrset for the browser's encoding when displaying the statistics report (same as option -C). This is needed for languages which require special character sets such as Chinese. See also the section about Multi-National Language Support. Example:
HTMLCharSet iso-8859-1
HTMLPrefix prefix and HTMLTrailer trailer
The HTML prefix and trailer to be inserted into the statistics output files at the top and bottom of the page. If defined, the HTMLPrefix string must include the <BODY> tag. To read the HTML code from a file, specify its name as the prefix or trailer. Example:
HTMLPrefix <BODY BGCOLOR="#FF0000">
HTMLTrailer <A HREF="/intern/">Back</A> to the internal page.
HeadFont fontlist, TextFont fontlist and ListFont fontlist
The fonts to use for headers, for regular text, and for the detailed lists. If unset, the analyzer uses a list of common serif-less fonts for headers and regular text and a monospaced (fixed) font for the detailed lists. To force the navigator's default for fonts, use the keyword default as the fontname. Example:
HeadFont Helvetica,Arial,Geneva,sans-serif
TextFont Helvetica,Arial,Geneva,sans-serif
ListFont Courier,fixed
HeadSize size, TextSize size, SmallSize size and ListSize size
The font sizes for headings (navigator default, usually 3), regular text (default: 2), small text (default: 1) and lists (default: 2). TextSize replaces the former FontSize, which is still recognized for backward compatibility with older configuration files. Example:
HeadSize 4
TextSize 3
SmallSize 2
HideAgent agent string
Hide a browser type under an arbitrary string (item). Needed only for a certain browser whose vendor still can't spell its name correctly. Only the leading part of the browser type is compared against agent, so no wildcards are needed in the second field. Example:
HideAgent Mozilla/4.0 (compatible; MSIE 4. MSIE 4.*
HideAgent Mozilla/3.0 (compatible; MSIE 3. MSIE 3.*
HideRefer referrer string
Hide certain referrer URLs under an arbitrary string (item). Useful to map different referrer URLs for a given host to a common name. Since only the leading string of the referrer URL is compared against referrer, there is no need to specify wildcards. As in HideAgent, a wildcard suffix is removed from the string, while a wildcard prefix is taken literally.
If the second argument contains a string in square brackets, this defines the CGI parameter which specifies the search key for search engines. In this case, the search key will be extracted from the argument list and prominently displayed after the name of the search engine/web server. See also the configuration file template produced by http-analyze -i for more examples of how to use the HideRefer directive. Example:
HideRefer http://altavista.digital.com/ AltaVista [q=]
HideRefer http://lycospro.lycos.com/ Lycos [query=]
HideRefer http://www.excite.com/ Excite [search=]
HideRefer http://www.dino-online.de/ Dino Online [query=]
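Extraction of the search key can be pictured roughly as in the following Python sketch. The function search_key is hypothetical, it relies on the standard urllib.parse module rather than on the analyzer's own code, and the referrer URL in the usage note is an invented example.

```python
from urllib.parse import parse_qs, urlsplit


def search_key(referrer, prefix, param):
    """Return the search key from a referrer URL, or None.

    prefix -- leading part of the referrer URL (first HideRefer field)
    param  -- CGI parameter from the square brackets, for example "q="
    """
    if not referrer.startswith(prefix):
        return None
    arguments = parse_qs(urlsplit(referrer).query)
    values = arguments.get(param.rstrip("="))
    return values[0] if values else None
```

With the AltaVista definition above, a referrer such as http://altavista.digital.com/cgi-bin/query?pg=q&q=web+statistics would give the key "web statistics" under these assumptions.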
HideSys hostname string
Hide a hostname under an arbitrary string (item). The string may contain blanks. If the first character of string is a `[', this item is suppressed in the Top N lists. Hidden items are accounted for separately, but in the summary they are collected under the description defined with this directive. You may use the wildcard character `*' as either a prefix or as a suffix of the hostname (as in *.host.com and 192.168.12.*), but not as both. Hostnames are case-insensitive.
When building the list of countries, http-analyze determines the country from the top-level domain given in hostname. If hostname is an IP number, you can optionally define the top-level domain to be accounted for by appending it in square brackets to the string as shown in the last example below. Example:
HideSys *.mycompany.com MY COMPANY
HideSys 192.168.12.* MY COMPANY [COM]
HideURL url string
Hide a URL under an arbitrary string (item). The string may contain blanks. If the first character of string is a `[', this item is suppressed in the Top N lists. Hidden items are accounted for separately, but in the summary they are collected under the description defined with this directive. You may use the wildcard character `*' as either a prefix or as a suffix of the URL (as in *.map and /subdir/*), but not as both. URLs are case-sensitive as required by the HTTP standard. If the option -M is specified, URLs will become case-insensitive for compatibility with non-compliant web servers. Note that images are hidden automatically under All images by default unless -x is specified. Example:
HideURL *.map [All image maps]
HideURL /robots.txt [Robot control file]
HideURL /newsletter/* MyCompany's Monthly Newsletter
HideURL /~delta-t/ DELTA-t Homepage
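The wildcard rule shared by HideSys and HideURL (a `*' either as prefix or as suffix, never both) might be sketched in Python like this. The name wildcard_match is hypothetical, and the ignore_case flag merely stands in for hostname matching or for the effect of option -M.

```python
def wildcard_match(pattern, value, ignore_case=False):
    """Match value against a pattern with '*' as prefix OR suffix."""
    if ignore_case:
        pattern, value = pattern.lower(), value.lower()
    if pattern.startswith("*"):
        # "*.map" matches everything ending in ".map"
        return value.endswith(pattern[1:])
    if pattern.endswith("*"):
        # "/newsletter/*" matches everything below /newsletter/
        return value.startswith(pattern[:-1])
    return value == pattern
```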
IgnURL url and IgnSys hostname
Ignore entries with a specific URL or accesses from a certain system. You may use the wildcard character `*' as either a prefix or as a suffix of the URL or the hostname (as in *.png, /subdir/file* and *.host.com), but not as both. Note that all logfile entries are compared against this list while http-analyze reads the logfile, as opposed to the HideURL and HideSys directives, which are looked up only after all entries have been reduced to the set of unique URLs and hostnames, respectively. Therefore, many IgnURL/IgnSys definitions will significantly increase the processing time of http-analyze. Example:
IgnURL *.png,*.jpg,*.jpeg
IndexFiles idxfile[,idxfile...]
Defines additional directory index filenames (same as option -H). The name index.html is pre-defined by default. http-analyze truncates URLs containing an index filename so that they merge with `/' (their "base URL"). For example, /dir/index.html is truncated to /dir/. You can add up to 9 more names for directory index files. Note that each name requires another table lookup, which may significantly increase processing time. Example:
IndexFiles Welcome.html,home.html,index.htm
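The truncation of index filenames can be illustrated with a few lines of Python. This is a sketch of the behaviour described above, not the analyzer's code; strip_index_file is a hypothetical name.

```python
def strip_index_file(url, index_files=("index.html",)):
    """Reduce a URL ending in a directory index filename to its base URL."""
    head, _, name = url.rpartition("/")
    if name in index_files:
        return head + "/"
    return url
```

With only the pre-defined name, /dir/index.html becomes /dir/ while /dir/Welcome.html stays unchanged until Welcome.html is added through IndexFiles.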
Language lang
Use the language lang for warning messages and for the statistics report (same as option -L). See the section Multi-National Language Support for more information about localization of http-analyze. Example:
Language de
LogFile filename
The name of the server's logfile. If you define a default name for the logfile, this file is processed if no other filenames are explicitly specified on the command line. If no logfile is specified, http-analyze always reads stdin. Example:
LogFile /usr/ns-home/www/logs/access
LogFormat logfmt
Use this logfile format. Valid values for logfmt are auto for auto-sensing the logfile format, clf for the NCSA Common Logfile Format, or dlf and elf for the two supported variants of the W3C Extended Logfile Format. See the section Logfile Formats for a detailed description of those formats. Example:
LogFormat clf
MSIISmode boolean_value
Use case-insensitive string comparison for URLs. Needed for MS IIS which makes no difference between upper- and lower-case characters. MS users may regard this as an enhancement, while for the rest of the world this is just a violation of the RFC 2068 HTTP standard and should be ignored. Example:
MSIISmode Yes
NavWinSize widthxheight
Defines the size of the navigation window which pops up in the conventional interface if JavaScript is enabled. Useful if the browser displays scrollbars when using the default size of 420x190 pixels. Example:
NavWinSize 440x200
NavigFrame size
Defines the size of the navigation frame in pixels. Useful if the browser displays scrollbars when using the default size of 120 pixels. Example:
NavigFrame 140
NoiseLevel hits
Sets the noise-level to hits. If a noise-level is defined, all URLs, sites, agents and referrer URLs with hits below this level are collected under the item Noise in the Top N lists and overviews to avoid cluttering up those lists. Example:
NoiseLevel 7
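The effect of the noise level on a list can be sketched as follows in Python. The function apply_noise_level and the dictionary layout are hypothetical; the sketch only mirrors the rule that items with hits below the level are collected under Noise.

```python
def apply_noise_level(hits_by_item, noise_level):
    """Collect all items with fewer hits than noise_level under "Noise"."""
    result = {}
    noise = 0
    for item, hits in hits_by_item.items():
        if hits < noise_level:
            noise += hits
        else:
            result[item] = hits
    if noise:
        result["Noise"] = noise
    return result
```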
OutputDir directory
The name of the directory where the output files of the statistics report should be created (same as option -o). By default, the output directory is the current directory. Example:
OutputDir /usr/www/htdocs/stats
PageView pattern[,pattern...]
Defines additional pageview patterns (same as option -G). All URLs matching one of the patterns are classified as pageviews (text files). If pattern starts with a slash (`/'), it is treated as a prefix each URL is compared with; otherwise it is treated as a suffix. The suffix .html is pre-defined by default. You can add up to 9 more patterns here, for example .shtml, .text and /cgi-bin/. Note that each pattern requires another table lookup, which may significantly increase processing time. Example:
PageView .shtml,.text,/cgi-bin/
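A minimal Python sketch of the prefix/suffix classification described above; is_pageview is a hypothetical name and the real lookup is table-based rather than a simple loop.

```python
def is_pageview(url, patterns=(".html",)):
    """Classify a URL as a pageview using prefix and suffix patterns."""
    for pattern in patterns:
        if pattern.startswith("/"):
            if url.startswith(pattern):  # leading slash: prefix pattern
                return True
        elif url.endswith(pattern):      # anything else: suffix pattern
            return True
    return False
```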
PrivateDir prvdir
Defines the name of a "private" directory for the detailed lists of files, sites, browsers and referrer URLs (same as option -p). Because prvdir must reside directly under the output directory, its name may not contain any slashes (`/'). A private directory for detailed lists may be useful to restrict access to those lists if the rest of the statistics report is publicly available. Note that for restricting access to the complete statistics report, you do not need to place the detailed lists in a private directory. Example:
PrivateDir lists
RegInfo customer_name registration_ID
Defines the customer's name and the registration ID, which are both shown on the main page in the summary report. Example:
RegInfo MyCompany 3745JMJZ00000311300000682344
ReportTitle title
The document title to use in the statistics report. Example:
ReportTitle Access Statistics for MyCompany
ServerName srvname
The official name of the server (same as option -S). If no server name is defined, http-analyze uses the hostname of the system it is running on. The server name must be a fully qualified domain name, not a URL. Example:
ServerName www.mycompany.com
ServerURL srvurl
The URL of the server to be used for hotlinks in URL lists (same as option -U). Useful if the report for your web server is published on another server. Also necessary for virtual servers to have http-analyze generate correct hypertext links in the report. Example:
ServerURL http://www.mycompany.com
Session time
The time-window for counting sessions. All unique hosts accessing your server more than once inside this time-window are accounted for as the same session. If the distance between two adjacent accesses from the same host is greater than the time-window, the accesses from this host are accounted for as different sessions. Example:
Session 4 hours
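The session rule can be made concrete with a short Python sketch. It assumes that accesses are given as (host, timestamp in seconds) pairs and that the gap is measured between adjacent accesses of the same host, as the description suggests; count_sessions is a hypothetical name.

```python
def count_sessions(accesses, window_seconds):
    """Count sessions from (host, timestamp_seconds) pairs."""
    last_seen = {}
    sessions = 0
    for host, timestamp in sorted(accesses, key=lambda access: access[1]):
        previous = last_seen.get(host)
        # A first access, or a gap larger than the window, opens a session.
        if previous is None or timestamp - previous > window_seconds:
            sessions += 1
        last_seen[host] = timestamp
    return sessions
```

With a window of 4 hours (14400 seconds), two accesses from one host that lie 1 hour apart count as one session, while a third access more than 4 hours after the second counts as a new one.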
ShowDomain number
Defines the number of components in a domain name which make up the organizational part (same as option -Z). This is usually the second-level domain, so that the last two components of the domain name (for example, company.com) are used as the organizational part. However, some countries prefer to use third-level domains, so that the hostnames use 4 or more components, where the last 3 are used for the organizational part (as in company.co.uk). To recognize such third-level domains, ShowDomain can be set to the value 3. Hostnames with exactly 3 components will still be reduced to their second-level domain if ShowDomain is set to 3. Example:
ShowDomain 3
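The reduction of hostnames to their organizational part might look like this in Python. The sketch follows the rule given above for the values 2 and 3; the behaviour for other values is an assumption, and organizational_part is a hypothetical name.

```python
def organizational_part(hostname, show_domain=2):
    """Reduce a hostname to its organizational part."""
    parts = hostname.split(".")
    # Hostnames without more components than show_domain fall back to
    # the second-level domain (for example www.company.com with 3).
    keep = show_domain if len(parts) > show_domain else 2
    return ".".join(parts[-keep:])
```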
StripCGI boolean_value
Do not strip arguments to CGI scripts (same as option -q). By default, http-analyze strips arguments from CGI URLs to be able to lump them together. If your server creates dynamic HTML files through a CGI script, they are reduced to the URL of the script. If StripCGI is set to Off, No, None, False or 0, those argument lists are left intact and CGI URLs with different arguments are treated as different URLs. Note that this only works for requests to scripts, which receive their arguments using the GET, but not the POST method. See the section Interpretation of the results for an explanation of the request methods. Example:
StripCGI No
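The difference between stripped and unstripped CGI URLs can be shown with a small Python sketch. It assumes that GET arguments follow a `?' in the URL; count_urls is a hypothetical name and the URLs in the usage note are invented.

```python
from collections import Counter


def count_urls(urls, strip_cgi=True):
    """Count hits per URL, optionally stripping CGI argument lists."""
    counts = Counter()
    for url in urls:
        if strip_cgi:
            url = url.split("?", 1)[0]  # drop everything after the '?'
        counts[url] += 1
    return counts
```

Two requests for /cgi-bin/show?id=1 and /cgi-bin/show?id=2 are lumped together as two hits on /cgi-bin/show by default, and counted as two separate URLs when stripping is turned off.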
Suppress subopt,...
Suppress certain lists in the report (same as -s). subopt may be one of:
AVLoad to suppress the average load report (top seconds/minutes/hours),
URLs to suppress the overview and list of URLs/items,
URLList to suppress the list of URLs/items only,
Code404 to suppress the list of Code 404 (Not Found) responses,
Sites to suppress the overview and list of client domains,
RSites to suppress the overview of reverse client domains,
SiteList to suppress the list of all client domains/hostnames,
Agents to suppress the overview and list of browser types,
Referrer to suppress the overview and list of referrer URLs,
Country to suppress the list of countries,
Pageviews to suppress pageview rating (cached files are shown instead),
AuthReq to suppress requests which required authentication,
Graphics to suppress images such as graphs and pie charts,
Hotlinks to suppress hotlinks in the list of all URLs,
Interpol to suppress interpolation of values in graphs.
Example:
Suppress Country,Interpol
TLDFile filename
Use filename for the list of top-level domains (same as option -T). This list includes all ISO two-letter country domains, the well-known domains .net, .int, .org, .com, .edu, .gov, .mil, .arpa, .nato, and the new CORE top-level domains .firm, .info, .shop, .arts, .web, .rec, and .nom. The length of a domain in the TLD file may not exceed 6 characters. Since http-analyze uses its built-in defaults if no TLD file is specified, you rarely will need this directive. Example:
TLDFile /usr/local/lib/http-analyze/TLD
TblFormat tblname specifier
Defines the layout of tables in the statistics report. The argument tblname may be one of:
Month for the statistics of the last 12 months (main page)
Day for the daily statistics in the short and full summaries
Load for the average load by weekday, hour, minute, second
Country for the list of countries
TopTen for all Top N lists
Overview for all overviews
Lists for all detailed lists (preformatted text)
NotFound for the list of NotFound responses
The specifier string defines the items to be shown in the table:
n, N an index number or label (don't touch!)
h, H the number of hits
f, F the number of files sent
c, C the number of cached files
p, P the number of pageviews
s, S the number of sessions
k, K the amount of data sent in Kbytes (integer value)
B the amount of data sent in bytes (float value)
L a dynamically created label (don't touch!)
If a format specifier is used in upper-case, the value displayed in the report will include the percentage for this number. Example:
TblFormat Month n h f c p s k
Top{Days,Hours,Minutes,Seconds,URLs,Sites,Agents,Refers}, LeastURLs
Defines the size of certain Top N tables and lists. If set to zero, the corresponding list will be suppressed. Example:
TopURLs 20
LeastURLs 0
TopDays 14
VirtualNames virtname,...
The list of additional ("virtual") names for this server to be classified as self-referrer URLs. The server's primary name (from ServerName or ServerURL) is pre-defined already. If virtname doesn't include a protocol specifier, two URLs with the http and the https protocol specifier will be added for each name. Since self-referrers are suppressed from the list of referrer URLs, the remaining entries give a good impression of external pages referring to some document on your site. Example:
VirtualNames www2.mycompany.com,mycompany.com
VirtualNames www.customer.com,customer.com
VirtualNames http://www.other.com,https://secure.other.com
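How names without a protocol specifier expand into two self-referrer URLs can be sketched in Python. The function expand_virtual_names is hypothetical, and detecting a protocol specifier by looking for "://" is an assumption.

```python
def expand_virtual_names(value):
    """Expand a VirtualNames value into a list of self-referrer URLs."""
    urls = []
    for name in value.split(","):
        name = name.strip()
        if "://" in name:
            urls.append(name)  # protocol given explicitly
        else:
            urls.append("http://" + name)
            urls.append("https://" + name)
    return urls
```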
VRMLProlog file
The name of a prolog file for a yearly VRML model (same as option -P). Pathnames not beginning with a `/' are relative to OutputDir. If a prolog file is given, an additional yearly model with all 12 monthly models embedded as inlines is created. See the section Output Files for further information about this yearly model. Example:
VRMLProlog 3Dprolog.wrl