In this example, we show caching and named samples. Caching lets you ensure that monitoring never overloads a service. Named samples allow logical grouping of related monitoring data from different targets.
As before, copy the configuration file below as the file /etc/monamid.d/example.conf, overwriting any existing file.
    ##
    ##  MonAMI by Example, Section 4
    ##

    # Our root filesystem
    [filesystem]
     name = root-fs
     location = /
     cache = 2 ❶

    # Our /home filesystem
    [filesystem]
     name = home-fs
     location = /home
     cache = 2s

    # Bring together information about the two partitions
    [sample]
     name = partitions ❷
     read = root-fs, home-fs
     cache = 10

    # Update our snapshot every ten seconds
    [sample]
     read = partitions ❸
     write = snapshot
     interval = 10

    # Once a minute, send data to a log file.
    [sample]
     read = partitions.root-fs.capacity.available, \ ❹
            partitions.home-fs.capacity.available
     write = filelog
     interval = 1m ❺

    # A file containing current filesystem information
    [snapshot]
     filename = /tmp/monami-fs-current

    # A permanent log of a few important metrics
    [filelog] ❻
     filename = /tmp/monami-fs-log
Some points of interest:
❶ The cache attribute specifies a guaranteed minimum delay between successive requests for information. Here, there will always be at least two seconds between consecutive requests. The value is a time-period: one or more words that specify how long the period should be. This is the same format as the sample interval attribute, so values such as 2s and 1m, both used in this example, are valid.

❷ Like all targets, this name must be unique.

❸ This sample reads all available metrics from the target named in its read attribute (here, partitions).

❹ Sometimes attribute lines can get quite long. To make them easier to read and edit, long lines can be broken down into multiple shorter lines provided the last character is a backslash (\).

❺ This interval is deliberately short to allow quick gathering of information. For normal use a much longer interval would be more appropriate.

❻ The filelog plugin creates a file, if it does not already exist, and appends a new line for each datatree it receives. It is a simple method of archiving monitoring information.
With this example, you should leave MonAMI running for a few minutes. Whilst it is running, you can check that data is being appended to the log file (/tmp/monami-fs-log) correctly using, for example, the cat program.
Depending on which version of MonAMI you are using and the current state of your partitions, the file /tmp/monami-fs-current should look like:

    "partitions.root-fs.fragment size" "1024" (B) [every 10s]
    "partitions.root-fs.blocks.size" "1024" (B) [every 10s]
    "partitions.root-fs.blocks.total" "264445" (blocks) [every 10s]
    "partitions.root-fs.blocks.free" "142771" (blocks) [every 10s]
    "partitions.root-fs.blocks.available" "129118" (blocks) [every 10s]
    "partitions.root-fs.capacity.total" "258.24707" (MiB) [every 10s]
    "partitions.root-fs.capacity.free" "139.424805" (MiB) [every 10s]
    "partitions.root-fs.capacity.available" "126.091797" (MiB) [every 10s]
    "partitions.root-fs.capacity.used" "118.822266" (MiB) [every 10s]
    "partitions.root-fs.files.used" "68272" (files) [every 10s]
    "partitions.root-fs.files.free" "56294" (files) [every 10s]
    "partitions.root-fs.files.available" "56294" (files) [every 10s]
    "partitions.root-fs.flag" "0" () [every 10s]
    "partitions.root-fs.namemax" "255" () [every 10s]
    "partitions.home-fs.fragment size" "4096" (B) [every 10s]
    "partitions.home-fs.blocks.size" "4096" (B) [every 10s]
    "partitions.home-fs.blocks.total" "16490546" (blocks) [every 10s]
    "partitions.home-fs.blocks.free" "3699442" (blocks) [every 10s]
    "partitions.home-fs.blocks.available" "2861754" (blocks) [every 10s]
    "partitions.home-fs.capacity.total" "64416.195312" (MiB) [every 10s]
    "partitions.home-fs.capacity.free" "14450.945312" (MiB) [every 10s]
    "partitions.home-fs.capacity.available" "11178.726562" (MiB) [every 10s]
    "partitions.home-fs.capacity.used" "49965.25" (MiB) [every 10s]
    "partitions.home-fs.files.used" "8388608" (files) [every 10s]
    "partitions.home-fs.files.free" "8008117" (files) [every 10s]
    "partitions.home-fs.files.available" "8008117" (files) [every 10s]
    "partitions.home-fs.flag" "0" () [every 10s]
    "partitions.home-fs.namemax" "255" () [every 10s]
The file /tmp/monami-fs-log should look like:

    # time partitions.root-fs.capacity.available partitions.home-fs.capacity.available
    2007-10-03 11:12:59 126.091797 11178.707031
    2007-10-03 11:13:59 126.091797 11178.703125
    2007-10-03 11:14:59 126.091797 11178.703125
    2007-10-03 11:15:59 126.091797 11178.710938
A named sample target is simply a sample target that has a name attribute specified. In contrast, a sample without any specified name attribute is an anonymous sample. All the samples in previous sections are anonymous.
The main use for named samples is to allow grouping of monitoring data. Suppose you wanted to monitor several aspects of a service; for example, count active TCP connections, watch the application's use of the database, and count the number of daemons running. You may, for ease of handling, want to build a datatree containing the combined set of metrics. A named sample allows you to do this.
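As a rough illustration of grouping, the partitions sample from the example could be extended to cover a third filesystem. This is only a sketch: the var-fs name and the /var location are hypothetical, and it uses only attributes that already appear in the example configuration. The [sample] section shown would replace the existing partitions section.

```ini
# Hypothetical third filesystem (name and location are assumptions)
[filesystem]
 name = var-fs
 location = /var
 cache = 2

# The named sample now groups three targets into one datatree
[sample]
 name = partitions
 read = root-fs, home-fs, var-fs
 cache = 10
```

With this change, metrics from the new target would be expected to appear under a partitions.var-fs branch, alongside partitions.root-fs and partitions.home-fs.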
Another aspect of named samples is that they allow other targets (such as anonymous samples) to request monitoring data from them. Named samples can be used, in effect, as a simple monitoring target (such as the root-fs target above).
In fact, anonymous sample sections do have a name: their name is assigned automatically when MonAMI starts. However, you should never use this name or need to know it. If you find you need to collect data from an anonymous sample, simply give the target a name.
Note that, although not illustrated in the above example, named samples will honour the interval attribute. This allows them to provide periodic monitoring information (in common with anonymous samples) whilst simultaneously allowing other targets to request information at other times.
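A named sample that uses the interval attribute might look like the following sketch, which would replace the partitions section of the example. It assumes that a named sample accepts the same write and interval attributes as the anonymous samples shown earlier; the values are illustrative.

```ini
# Named sample that also delivers data periodically (sketch)
[sample]
 name = partitions
 read = root-fs, home-fs
 write = snapshot
 interval = 10
 cache = 10
```

With such a section in place, the separate anonymous sample that updates the snapshot would no longer be needed, while other targets could still request data from partitions between the ten-second updates.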
Monitoring will always incur some cost (computational, memory and sometimes storage and network bandwidth usage). Sometimes this cost is sufficiently high that we might want to rate-limit any queries so, for example, a service is never monitored more than once every minute.
Within MonAMI, this is achieved with the cache attribute. You can configure any target to cache gathered metrics for a period. In the above example, metrics from the partitions named sample are cached for ten seconds. If one of the anonymous samples had its interval attribute set to less than 10 seconds, its requests would not trigger any gathering of fresh data while the cached result was still valid. Instead, it would receive the previous (cached) result until the ten-second cache had expired.
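To see this behaviour, the anonymous sample that updates the snapshot could be given a shorter interval than the ten-second cache on partitions. The fragment below is a sketch that changes only the interval value of the existing section; five seconds is an arbitrary choice.

```ini
# Request data every five seconds from a sample that caches for ten
[sample]
 read = partitions
 write = snapshot
 interval = 5
```

Going by the description above, the snapshot would still be rewritten every five seconds, but roughly every other write would be expected to repeat cached values rather than trigger fresh gathering.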
By default MonAMI will cache all results for one second. Since MonAMI monitoring frequency (the interval attribute) has a granularity of one second, this default cache will not be noticed when a target obtains data. However, if multiple targets request data from the same target at almost the same time (to within a second), the default cache ensures all the requests receive data from a single datatree.
Note that the cache attribute works for sample targets, as demonstrated in the above example. Caching targets with different cache-intervals allows a conservative level of caching for the bulk of the monitoring activity whilst retaining the possibility of adding more frequent monitoring.
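One possible arrangement, sketched below, keeps the ten-second cache on partitions for the bulk of the monitoring and adds a more frequent check of a single metric by reading root-fs directly, where only its own two-second cache applies. The root-fs-log target name and its filename are hypothetical. The sketch also assumes that a second filelog target can be distinguished by giving it a name attribute, and that the metric path shown is valid when read directly from root-fs.

```ini
# Frequent check of one metric, bypassing the partitions cache
[sample]
 read = root-fs.capacity.available
 write = root-fs-log
 interval = 2

# Hypothetical second log file for the frequent readings
[filelog]
 name = root-fs-log
 filename = /tmp/monami-root-fs-log
```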
Some monitoring plugins will report a different set of metrics over time; this causes the structure of the datatree to change as the number of reported metrics varies. Most often this happens when the service being monitored changes availability (when a service “goes down” or “comes up”), although some services report additional metrics once they have stabilised. The Apache HTTP server is an example; after an initial delay, it provides a measure of bandwidth usage.
When a change in a datatree structure is detected, MonAMI will invalidate all its internal caches that use this datatree; independent caches are left unaffected. Subsequent requests to a target for fresh data will gather new data, either freshly cached or direct from the monitoring target. This allows the new structure to propagate independent of the cache attributes.