Monitoring Device Availability and Latency

This section describes how to monitor device availability and latency in Skylar One (formerly SL1).

Use the following menu options to navigate the Skylar One user interface:

To view a pop-out list of menu options, click the menu icon.
To view a page containing all of the menu options, click the Advanced menu icon.

Availability

Availability means a device's ability to accept connections and data from the network. During polling, a device has two possible availability values:

100%. Device is up and running.
0%. Device is not accepting connections and data from the network.

By default, the method Skylar One uses to monitor availability of the device is determined by the first method of discovery:

If the Skylar One agent is installed and creates a device record before the device is discovered as an SNMP or pingable device, availability is measured based on whether the agent is reporting data to Skylar One.
If the device is discovered as an SNMP or pingable device before the agent is installed, availability is measured based on the method used to discover the device (SNMP, ICMP, or TCP).

If a device or interface becomes unavailable multiple times in a specified time frame, Skylar One can generate an "availability flapping" event. By default, Skylar One generates an event if a device becomes unavailable three times in an hour, or if an interface becomes unavailable three times in twenty-four hours.
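
The flapping rule above amounts to counting outages inside a sliding time window. The following is a minimal sketch of that idea, not the Skylar One implementation; the class name, method name, and parameters are hypothetical, and the defaults mirror the device rule described above (three outages in one hour).

```python
from collections import deque
from datetime import datetime, timedelta


class FlappingDetector:
    """Counts transitions to 'unavailable' inside a sliding time window."""

    def __init__(self, max_outages: int = 3, window: timedelta = timedelta(hours=1)):
        self.max_outages = max_outages
        self.window = window
        self._outages = deque()  # timestamps of each transition to unavailable

    def record_outage(self, when: datetime) -> bool:
        """Record one outage; return True if the flapping condition is met."""
        self._outages.append(when)
        # Drop outages that have aged out of the window.
        while self._outages and when - self._outages[0] > self.window:
            self._outages.popleft()
        return len(self._outages) >= self.max_outages


detector = FlappingDetector()
start = datetime(2024, 1, 1, 12, 0)
# Three outages within 40 minutes: the third one meets the condition.
results = [detector.record_outage(start + timedelta(minutes=m)) for m in (0, 20, 40)]
```

For the interface rule, the same sketch could be constructed with window=timedelta(hours=24).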

To generate availability reports, Skylar One must be configured to collect availability and latency data from devices. The following section describes how to configure Skylar One to collect this data.

NOTE: Unlike for hardware-based devices, Skylar One does not use ICMP, TCP, or UDP to monitor availability for component devices. Component devices use a Dynamic Application collection object to measure availability. Skylar One polls component devices for availability at the frequency defined in the Dynamic Application.

Configuring Availability Monitoring on a Device

Skylar One uses ports to monitor a device's availability. You specify which ports to use for device availability in the Device Properties page.

NOTE: Unlike for hardware-based devices, Skylar One does not use ICMP, TCP, or UDP to monitor availability for component devices. Component devices use a Dynamic Application collection object to measure availability. Skylar One polls component devices for availability at the frequency defined in the Dynamic Application. For more information, see the section Configuring Availability for Component Devices.

To configure availability monitoring for a device:

Go to the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface).
In the Device Manager page, find the device for which you want to configure availability monitoring. Click its wrench icon. The Device Properties page displays.
In the Device Properties page, edit the following fields:

Availability Port. Specifies the protocol (first drop-down menu) and specific port (second drop-down menu) that Skylar One should monitor to determine if the device is available. The list of ports will contain all the ports discovered by Skylar One. The data collected from this port will be used in device availability reports. Protocol options include:

TCP. Availability is based on whether Skylar One can connect to the device using the specified TCP port.
ICMP. Availability is based on whether the device responds to an ICMP ping request from Skylar One. If you select ICMP as the protocol, you can use the ICMP Availability Thresholds fields in the Device Thresholds page to further define how Skylar One will test the device's availability.
SNMP. Availability is based on whether the device responds to an SNMP GET request from Skylar One.
ScienceLogic Agent. Availability is based on whether the Skylar One Agent is reporting data to Skylar One. The agent must be installed on the device to use this option.

Avail + Latency Alert. Specifies how Skylar One should respond when the device fails an availability check, a latency check, or both. These options allow you to create separate events when SNMP fails on a device and when a device is not up and running (indicated by the device failing both the availability check and the latency check). Choices are:

Enabled. Skylar One will create the following events:

If the device fails the availability check, generates the event "Device Failed Availability Check: UDP - SNMP".
If the device fails the latency check, generates the event "Network Latency Exceeded Threshold: No Response".
If the device fails both the availability check and the latency check, generates the event "Device Failed Availability and Latency checks".

Disabled. Skylar One will create the following events:

If the device fails the availability check, generates the event "Device Failed Availability Check: UDP - SNMP".
If the device fails the latency check, generates the event "Network Latency Exceeded Threshold: No Response".
If the device fails both the availability check and the latency check, generates the Major event "Device Failed Availability Check: UDP - SNMP". The Minor event "Network Latency Exceeded Threshold: No Response" is rolled up under the availability event.

Click Save.

NOTE: The Ping & Poll Timeout (Msec) setting in the Behavior Settings page (System > Settings > Behavior) affects how Skylar One monitors device availability. This field specifies the number of milliseconds the discovery tool and availability polls will wait for a response after pinging a device. After the specified number of milliseconds has elapsed, the poll times out.
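
To illustrate how a TCP availability check and a poll timeout fit together, here is a minimal sketch using only the Python standard library. It is not how Skylar One performs the check; the function name and the 1000 ms default are hypothetical. It returns the two availability values described earlier: 100 if the port accepts a connection before the timeout, 0 otherwise.

```python
import socket


def tcp_availability(host: str, port: int, timeout_msec: int = 1000) -> int:
    """Return 100 if a TCP connection opens before the timeout, else 0."""
    try:
        # create_connection takes its timeout in seconds, so convert from ms.
        with socket.create_connection((host, port), timeout=timeout_msec / 1000):
            return 100
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return 0
```

For example, tcp_availability("192.0.2.10", 22, timeout_msec=500) would report whether a device at that documentation-only address accepts connections on port 22 within half a second.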

Defining Availability Thresholds

Skylar One allows you to define global Availability Thresholds that apply to all devices and device-specific Availability Thresholds that apply to a selected device. When a device fails to meet the availability threshold (that is, is not available as specified in the threshold), Skylar One generates an event about the device.

For details on defining availability thresholds, see the section on Thresholds and Data Retention.

Configuring Availability for Component Devices

Dynamic Applications that create component devices have the Component Mapping checkbox selected in the Dynamic Applications Properties Editor page and also include the Component Identifiers field.

In the Component Identifiers field, you map the value of a collection object to the Device Name identifier and Unique Identifier identifier, so Skylar One can create one or more component devices.

In the Component Identifiers field, you can also map a collection object to the Availability identifier. For hardware-based devices, Skylar One monitors an ICMP, TCP, or UDP port to determine availability. Because component devices might not include ICMP, TCP, or UDP ports, you must use a Component Identifier to determine availability.

To configure Skylar One to monitor availability for a component device:

Go to the Dynamic Applications Manager page (System > Manage > Dynamic Applications).
Find the Dynamic Application that creates and monitors the component devices you are interested in. Click its wrench icon.
In the Dynamic Applications Properties Editor page, examine the Component Mapping checkbox. If the checkbox is selected, this is the correct Dynamic Application to edit.
Click the Collections tab.
In the list of Collection Objects in the Collection Object Registry pane, determine which collection object will always be available if the component device is available. Click the wrench icon for that collection object.
In the Component Identifiers field, select:

Availability. Object that specifies whether a component device is available. If Skylar One can collect a value for a component device using the aligned collection object and the value is not 0 (zero) or "false", Skylar One considers the component device as "available". If Skylar One cannot collect a value for a component device using the aligned collection object or Skylar One collects a value that is 0 (zero) or "false", Skylar One considers the component device as "unavailable".

If the collection objects aligned with the Device Name and Unique Identifier component identifiers return lists of values, Skylar One will create multiple component devices. Each component device will be associated with an index, that is, a location in the list of values. If all the component devices in the list should be considered available, the collection object aligned with the Availability component identifier should return a list of values with a value at each index associated with a component device. A component device is unavailable when that list has no value at the component device's index, or has a value of 0 (zero) or "false" at that index. For more information about Dynamic Application indexing, see the Dynamic Application Development section.

If you align a collection object with this component identifier, Skylar One will create a system availability graph for each component device in the Device Performance page.
If you align a collection object with this component identifier and Skylar One cannot collect a value for a component device using the aligned collection object, Skylar One will supply the value "Unavailable" in the Collection State column in the Device Components page.

Click Save. Skylar One will now monitor availability and graph availability statistics for the component devices aligned with the Dynamic Application.
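
The availability rule for component devices described in this procedure can be sketched as follows. This is an illustration only, under the assumption that collected values arrive as a mapping from index to value; the function names and that data shape are hypothetical, not part of Skylar One.

```python
def is_available(value) -> bool:
    """A collected value counts as available unless it is missing, 0, or "false"."""
    if value is None:  # no value was collected at this index
        return False
    if isinstance(value, str):
        text = value.strip().lower()
        if text == "false":
            return False
        try:
            return float(text) != 0  # "0" and "0.0" count as zero
        except ValueError:
            return True  # any other collected string counts as a value
    return bool(value)  # 0, 0.0, and False are unavailable


def component_availability(availability_values: dict, component_indexes: list) -> dict:
    """Map each component index to True (available) or False (unavailable)."""
    return {idx: is_available(availability_values.get(idx)) for idx in component_indexes}


# Index 2 has no collected value, and index 1 returned "false".
collected = {0: 1, 1: "false", 3: "up"}
status = component_availability(collected, [0, 1, 2, 3])
# status == {0: True, 1: False, 2: False, 3: True}
```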

Critical Ping

Critical Ping is a tool that allows you to monitor a device as frequently as every five seconds. If the device does not respond, Skylar One creates an event. You can enable or disable critical ping for a device from its Device Properties page (Devices > Classic Devices > wrench icon).

Skylar One does not use critical ping to create device-availability reports. Skylar One will continue to collect device-availability data only every five minutes, as specified in the process "Data Collection:Availability" in the Process Manager page (System > Settings > Admin Processes).

Critical Ping uses the following global default values:

Ping Count. This field specifies the number of packets that should be sent during each critical ping. The default value is "1".
Required Ping Percentage. This field specifies the percentage of packets that must be returned during a critical ping before Skylar One considers the device available. The default value is "100%".
Packet Size. This field specifies the size of each packet, in bytes, that is sent during each critical ping. The default value is "56 bytes".
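
As an illustration of how Ping Count and Required Ping Percentage interact, the sketch below decides availability from the number of packets sent and returned. The function name and signature are hypothetical; the default of 100 matches the global default listed above.

```python
def critical_ping_available(packets_sent: int, packets_returned: int,
                            required_percentage: float = 100.0) -> bool:
    """Return True if enough packets came back for the device to count as available."""
    if packets_sent <= 0:
        raise ValueError("packets_sent must be positive")
    returned_percentage = 100.0 * packets_returned / packets_sent
    return returned_percentage >= required_percentage
```

With the defaults (one packet, 100% required), a single lost packet marks the device unavailable. With four packets and a required percentage of 75, the device would still count as available after losing one packet.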

To adjust these global values or to allow Critical Ping to inherit the per-device values for ICMP Availability Thresholds defined in the Device Thresholds page (Devices > Classic Devices > wrench icon > Thresholds, or Registry > Devices > Device Manager > wrench icon > Thresholds in the classic SL1 user interface), contact ScienceLogic Customer Support.

To define critical ping for a device:

Go to the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface).
In the Device Manager page, find the device for which you want to define critical ping. Click its wrench icon. The Device Properties page displays.
In the Device Properties page, edit the following fields:

Critical Ping. Frequency with which Skylar One should ping the device in addition to the five-minute availability poll. If the device does not respond, Skylar One creates an event. The choices are:

Disabled. Skylar One will not ping the device in addition to the five-minute availability poll.
Intervals ranging from every 120 seconds to every 5 seconds.

NOTE: Skylar One does not use this ping data to create device-availability reports. Skylar One will continue to collect device availability data only every five minutes, as specified in the process "Data Collection:Availability" in the Process Manager page (System > Settings > Admin Processes).

NOTE: Because high-frequency data pull occurs every 15 seconds, you might experience up to 15 seconds of latency between an unavailable alert and that alert appearing in the Database Server if you set Critical Ping to 5 seconds.

TIP: You might experience some performance issues if you have a large number of devices using critical ping on a short polling interval. If you have a large number of devices and are experiencing a delay in events being generated for a critical ping outage, try increasing the interval time.

Click Save.

Latency

Latency means the amount of time it takes Skylar One to communicate with a device. Specifically, latency refers to the amount of time between when Skylar One initiates communication with a device and when the device responds and allows communication. Latency is expressed in milliseconds (ms).

The latency calculation that is reported in Skylar One varies based on the method used to check it:

For TCP, Skylar One reports half of the time it takes for the connection to be opened.
For ICMP, Skylar One reports half of the round-trip time for a ping.
For UDP, Skylar One reports half of the time it takes to call getnext on .1.3.6.1 and receive a response.
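
The "half of the elapsed time" rule can be sketched for the TCP case as follows. This is a minimal illustration with the Python standard library, not the Skylar One implementation; the function names and the one-second default timeout are hypothetical.

```python
import socket
import time


def reported_latency_ms(elapsed_seconds: float) -> float:
    """Halve a measured elapsed time and convert it to milliseconds."""
    return elapsed_seconds * 1000 / 2


def tcp_latency_ms(host: str, port: int, timeout_seconds: float = 1.0):
    """Time a TCP connection attempt; return the halved time in ms, or None on failure."""
    started = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            elapsed = time.perf_counter() - started
    except OSError:
        return None  # no connection, so there is no latency to report
    return reported_latency_ms(elapsed)
```

Under this rule, a connection that takes 50 ms to open would be reported as a latency of 25 ms.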

Skylar One uses ports to monitor a device's latency. You specify which ports to use for device latency on the Settings tab of the Device Investigator page (or the Device Properties page in the classic Skylar One user interface).

Configuring Latency Monitoring on a Device

Skylar One uses ports to monitor a device's latency. You specify which ports to use for device latency in the Device Properties page.

To configure latency monitoring for a device:

Go to the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface).
In the Device Manager page, find the device for which you want to configure latency monitoring. Select its wrench icon.
The Device Properties page appears.
In the Device Properties page, edit the following fields:

Latency Port. Specifies the protocol (first drop-down menu) and specific port (second drop-down menu) Skylar One should monitor to determine latency for the device. The list of ports will contain all the ports discovered by Skylar One. The data collected from this port will be used in device latency reports.

If you select ICMP as the protocol, you can use the ICMP Availability Thresholds in the Device Thresholds page to further define how Skylar One will test the device's latency.

Avail + Latency Alert. Specifies how Skylar One should respond when the device fails an availability check, a latency check, or both. These options allow you to create separate events when SNMP fails on a device and when a device is not up and running. Choices are:

Enabled. Skylar One will create the following events:

If the device fails the availability check, generates the event "Device Failed Availability Check: UDP - SNMP".
If the device fails the latency check, generates the event "Network Latency Exceeded Threshold: No Response".
If the device fails both the availability check and the latency check, generates the event "Device Failed Availability and Latency checks".

Disabled. Skylar One will create the following events:

If the device fails the availability check, generates the event "Device Failed Availability Check: UDP - SNMP".
If the device fails the latency check, generates the event "Network Latency Exceeded Threshold: No Response".
If the device fails both the availability check and the latency check, generates only the event "Device Failed Availability Check: UDP - SNMP". The event "Network Latency Exceeded Threshold: No Response" is suppressed under the availability event.
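
The Enabled and Disabled choices for Avail + Latency Alert can be summarized as a small decision function. This sketch only restates the list above; the function name and its boolean parameters are hypothetical, and the event strings are copied from that list.

```python
AVAILABILITY_EVENT = "Device Failed Availability Check: UDP - SNMP"
LATENCY_EVENT = "Network Latency Exceeded Threshold: No Response"
COMBINED_EVENT = "Device Failed Availability and Latency checks"


def events_for_poll(availability_failed: bool, latency_failed: bool,
                    alert_enabled: bool) -> list:
    """Return the event messages generated for one poll result."""
    if availability_failed and latency_failed:
        # Enabled: one combined event. Disabled: only the availability
        # event, with the latency event suppressed under it.
        return [COMBINED_EVENT] if alert_enabled else [AVAILABILITY_EVENT]
    if availability_failed:
        return [AVAILABILITY_EVENT]
    if latency_failed:
        return [LATENCY_EVENT]
    return []
```

The two settings differ only when both checks fail; a single failed check produces the same event either way.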

Defining Latency Thresholds

Skylar One allows you to define global Latency Thresholds that apply to all devices and device-specific Latency Thresholds that apply only to a specific device. When a device fails to meet the latency threshold (that is, takes longer than the specified time-span to respond), Skylar One generates an event about the device. For example, if the latency threshold is "100 ms", when a device does not respond to a poll within 100 ms, Skylar One will generate an event about that device.

To disable the latency threshold for a single device, set the threshold to 0% (zero percent). When you disable a threshold, Skylar One does not generate an event for the threshold.
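
The threshold behavior described above can be sketched as follows. The function names are hypothetical, and two points are assumptions rather than documented behavior: that a device-specific threshold takes precedence over the global one when both exist, and that a disabled threshold is represented by the value zero.

```python
from typing import Optional


def effective_threshold_ms(global_threshold_ms: float,
                           device_threshold_ms: Optional[float] = None) -> float:
    """Pick the device-specific threshold if one is set (assumed precedence)."""
    if device_threshold_ms is not None:
        return device_threshold_ms
    return global_threshold_ms


def latency_event_needed(latency_ms: float, threshold_ms: float) -> bool:
    """Return True if the measured latency should generate an event."""
    if threshold_ms == 0:  # a threshold of zero is treated as disabled
        return False
    return latency_ms > threshold_ms
```

With a global threshold of 100 ms, a device that responds in 140 ms would generate an event, while the same response on a device whose own threshold is set to zero would not.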

For details on defining latency thresholds, see the section on Thresholds and Data Retention.

Viewing Reports on Device Availability and Device Latency

See the section on Viewing Performance Graphs for information and examples of reports for device availability and device latency.