Collector Groups

This section provides an overview of collector groups in SL1. A collector group—sometimes referred to as a CUG—is a group of SL1 Data Collectors that retrieve data from managed devices and applications so you can use that data in SL1.

Use the following menu options to navigate the SL1 user interface:

To view a pop-out list of menu options, click the menu icon.
To view a page containing all of the menu options, click the Advanced menu icon.

What is a Collector Group?

A collector group—sometimes referred to as a CUG—is a group of SL1 Data Collectors. Data Collectors retrieve data from managed devices and applications. This collection occurs during initial discovery, during nightly updates, and in response to policies defined for each managed device. The collected data is used to trigger events, display data in the user interface, and generate graphs and reports.

You can group multiple Data Collectors into a collector group. Depending on the number of Data Collectors in your SL1 system, you can define one or more collector groups. Each collector group must include at least one Data Collector.

On the Collector Groups page (Manage > Collector Groups)—or the Collector Group Management page (System > Settings > Collector Groups) in the classic SL1 user interface—you can view a list of existing collector groups, add a collector group, and edit a collector group.

System upgrades will only consider Data Collectors and Message Collectors that are members of a collector group.

Grouping multiple Data Collectors allows you to:

Create a load-balanced collection system, where you can manage more devices without loss of performance. At any given time, the Data Collector with the lightest load manages the next discovered device.
Optionally, create a redundant, high-availability system that minimizes downtime should a failure occur. If a Data Collector fails, one or more of the other Data Collectors in the collector group will handle collection until the problem is solved.

NOTE: If you are using an SL1 All-In-One Appliance, most of the sections in this chapter do not apply to your system. For an All-In-One Appliance, a single, default collector group is included with the appliance; you cannot create any additional collector groups. However, you can view information about the default collector group. You can also create a virtual collector group, for data storage only. The other tasks described in this section do not apply to an All-In-One Appliance.

Installing, Configuring, and Licensing Data Collectors

Before you can create a collector group, you must install and license at least one Data Collector. For details on installation and licensing of a Data Collector, see the Installation section.

After you have successfully installed, configured, and licensed a Data Collector, the platform automatically adds information about the Data Collector to the Database Server.

For more information on using external credential services that store, collect, and retrieve secret data, see the section on Using External Credential Services in the Discovery and Credentials manual.

Technical Information About Data Collectors

You might find the following technical information about Data Collectors helpful when creating collector groups.

Duplicate IP Addresses

A single collector group cannot include multiple devices that use the same Admin Primary IP Address (this is the IP address the platform uses to communicate with a device). If a single collector group includes multiple devices that use the same Primary IP Address or use the same Secondary IP Address, the platform will generate an event. Best practice is to ensure that within a single collector group, all IP addresses on all devices are unique.

During initial discovery, if a device is discovered with the same Admin Primary IP Address as a previously discovered device in the collector group, the later discovered device will appear in the discovery log, but will not be modeled in the platform. That is, the device will not be assigned a device ID and will not be created in the platform. The platform will generate an event specifying that a duplicate Admin Primary IP was discovered within the collector group.
If you try to assign a device to a collector group, and the device's Admin Primary IP Address already exists in the collector group, the platform will display an error message, and the device will not be aligned with the collector group.
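The duplicate-address rule above can be illustrated with a short check. This is a minimal sketch, not SL1's implementation: the function name `find_duplicate_admin_ips` and the list of (device name, IP address) pairs are assumptions made for this example.

```python
from collections import defaultdict


def find_duplicate_admin_ips(devices):
    """Return {ip: [device names]} for every IP used by more than one device.

    `devices` is a hypothetical list of (device_name, admin_primary_ip)
    pairs that all belong to the same collector group.
    """
    by_ip = defaultdict(list)
    for name, ip in devices:
        by_ip[ip].append(name)
    # Keep only the addresses that are shared by two or more devices.
    return {ip: names for ip, names in by_ip.items() if len(names) > 1}


group = [
    ("router-a", "10.0.0.1"),
    ("switch-b", "10.0.0.2"),
    ("router-c", "10.0.0.1"),  # same Admin Primary IP as router-a
]
duplicates = find_duplicate_admin_ips(group)
```

Running a check like this against an inventory export before you move devices between collector groups can surface conflicts before SL1 rejects the alignment.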

Open Ports

By default, Data Collectors accept connections only to the following ports:

TCP 22 (SSH)
TCP 53 (DNS)
TCP 123 (NTP)
UDP 161 (SNMP)
UDP 162 (Inbound SNMP Trap)
UDP 514 (Inbound Syslog)
TCP 7700 (Web Configuration Utility)
TCP 7707 (one-way communication from the Database Server)

For increased security, all other ports are closed.
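When you review firewall rules, it can help to hold the list above as data. The sketch below copies the ports exactly as listed in this section; the names `DEFAULT_OPEN_PORTS` and `is_open_by_default` are assumptions for illustration and are not part of SL1.

```python
# Default open ports on a Data Collector, copied from the list above.
DEFAULT_OPEN_PORTS = {
    ("tcp", 22): "SSH",
    ("tcp", 53): "DNS",
    ("tcp", 123): "NTP",
    ("udp", 161): "SNMP",
    ("udp", 162): "Inbound SNMP Trap",
    ("udp", 514): "Inbound Syslog",
    ("tcp", 7700): "Web Configuration Utility",
    ("tcp", 7707): "one-way communication from the Database Server",
}


def is_open_by_default(protocol, port):
    """Return True if the protocol/port pair appears in the default list."""
    return (protocol.lower(), port) in DEFAULT_OPEN_PORTS
```

For example, `is_open_by_default("udp", 161)` is true, while `is_open_by_default("tcp", 443)` is false because all other ports are closed.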

Viewing the List of Collector Groups

The Collector Groups page displays a list of all collector groups in your SL1 system.

For each collector group, the page displays the following:

ID. Unique numeric identifier automatically assigned by SL1 to each collector group.
Name. Name of the collector group.
Devices Count. Number of devices currently using the collector group for data collection.
Message Collectors. The name(s) of the Message Collector(s) (if any) associated with the collector group.
Data Collectors. The name(s) of the Data Collectors in the collector group.
Edit User. User who created or last edited the collector group.
Edit Date. Date and time the collector group was created or last edited.
Collector Failover. Indicates if Data Collector failover is enabled or disabled for the collector group.
Enable Concurrent SNMP Collection. Indicates if the collector group has concurrent SNMP collection enabled, disabled, or set to the systemwide default setting.
Enable Concurrent PowerShell Collection. Indicates if the collector group has concurrent PowerShell collection enabled, disabled, or set to the systemwide default setting.
Enable Concurrent Network Interface Collection. Indicates if the collector group has concurrent network interface collection enabled, disabled, or set to the systemwide default setting.
Collectors Available for Failover. The number of Data Collectors that must be available before a Data Collector failover can occur, if Data Collector failover is enabled for the collector group.
Failback Mode. Indicates if failback is automatic or manual, if Data Collector failover is enabled for the collector group.
Failover Delay (in minutes). The number of minutes SL1 should wait after a Data Collector outage before redistributing the data collection tasks among the other Data Collectors in the collector group, if Data Collector failover is enabled for the collector group.
Failback Delay (in minutes). The number of minutes SL1 should wait after the failed Data Collector is restored before redistributing data-collection tasks among the collector group, including the previously failed Data Collector, if Data Collector failover is enabled for the collector group.
Status. Indicates the Oracle Linux 8 (OL8) conversion status for the collector group.
Organization(s). The organization(s) to which the collector group is assigned.

The Organization(s) column displays only if multi-tenancy is enabled for collector groups. For more information about editing a collector group's organizations, see the section on Aligning Collector Groups to Organizations.

If you do not see one of these columns on the Collector Groups page, click the Select Columns icon to add or remove columns. You can also drag columns to different locations on the page or click on a column heading to sort the list of collector groups by that column's values. SL1 retains any changes you make to the columns that appear on the Collector Groups page and will automatically recall those changes the next time you visit the page.

You can filter the items on this inventory page by typing filter text or selecting filter options in one or more of the filters found above the columns on the page. For more information, see Filtering Inventory Pages.

You can adjust the size of the rows and the size of the row text on this inventory page. For more information, see the section on Adjusting the Row Density.

Viewing the List of Collector Groups in the Classic SL1 User Interface

To view the list of collector groups in the classic SL1 user interface:

Go to the Collector Group Management page (System > Settings > Collector Groups).
The Collector Group Registry pane displays a list of all collector groups in your SL1 system. For each collector group, the Collector Group Management page displays the following:

Name. Name of the collector group.
ID. Unique numeric identifier automatically assigned by SL1 to each collector group.
Organization. The organization(s) to which the collector group is assigned. Click the organization icon to edit the collector group's organizations.
The Organization column displays only if multi-tenancy is enabled for collector groups. For more information about editing a collector group's organizations, see the section on Aligning Collector Groups to Organizations in the Classic SL1 User Interface.
# Collectors. Number of Data Collectors in the collector group.
Msg Collector. Name of the Message Collector(s) (if any) associated with the collector group.
# Devices. Number of devices currently using the collector group for data collection.
Edit User. User who created or last edited the collector group.
Edit Date. Date and time the collector group was created or last edited.

Creating a Collector Group

Pre-Deployment Questions for a Collector Group

Consider the following questions before creating a new collector group. Your responses to these questions will help you determine how to create and name your new collector group:

Will your collector group span regionally close data centers and be configured for maximum resilience?
Will your users be required to know your collector group naming scheme, or will you provide a general collector group for them to use as a default (and use specialized collector groups for distinct use cases only)?
Will your collector group be structured for minimum latency to the monitored endpoints?
Consider the following questions about the resilience of your deployment:

What happens to the ability to monitor if a data center hosting an entire collector group goes offline?
Is the deployment resilient and will it perform well?
What is your failure mode: 100% > 0%, or 100% > 50% > 0%?

Capacity Planning for a Collector Group

In addition to deciding on your resiliency strategy, look at your failure mode and determine if you are allocating sufficient capacity to achieve a 100% > 50% capacity degradation on a data center failure before failing completely at 0%.

Consider the number of devices in your collector group and the number of Data Collectors in your collector group to determine if you have overloaded Data Collectors or underpowered Data Collectors.
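The capacity questions above come down to simple arithmetic. The sketch below is an illustration only: `devices_per_collector` is a hypothetical sizing figure that you would take from your own sizing guidance, and both function names are invented for this example.

```python
import math


def collectors_needed(devices, devices_per_collector):
    """Smallest number of Data Collectors that can carry `devices`."""
    return math.ceil(devices / devices_per_collector)


def capacity_after_failure(total_collectors, failed_collectors,
                           devices, devices_per_collector):
    """Summarize what is left after `failed_collectors` go offline."""
    surviving = total_collectors - failed_collectors
    if surviving < 0:
        raise ValueError("failed_collectors exceeds total_collectors")
    capacity = surviving * devices_per_collector
    return {
        "surviving": surviving,
        "capacity": capacity,
        # True when the survivors can still monitor every device.
        "can_carry_all": capacity >= devices,
    }


# 1,000 devices, an assumed 400 devices per collector, half the group lost.
four = capacity_after_failure(4, 2, 1000, 400)
six = capacity_after_failure(6, 3, 1000, 400)
```

With these assumed numbers, four Data Collectors cannot absorb the loss of a data center that hosts half of them, while six can.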

Defining a Collector Group

To define a new collector group:

Go to the Collector Groups page (Manage > Collector Groups).
Click the Add Collector Group button. The Add Collector Group modal appears.
On the Add Collector Group modal, complete the following fields:

Collector Group Name. Type a name for the collector group.
Virtual Collector Group (vCUG). Toggle this option on to make the collector group a virtual collector group. Virtual collector groups do not contain any Data Collectors or Message Collectors, and SL1 does not collect any data from devices aligned with a virtual collector group. Instead of collecting data, virtual collector groups serve only as storage areas for historical data from decommissioned devices.
Generate Alert on Collector Outage. Toggle this option on to specify that the platform should generate an event if a Data Collector has an outage, or toggle it off if the platform should not generate an event if a Data Collector has an outage.
All current and future organizations. Toggle this option on to align the collector group to all of your SL1 organizations, or toggle it off to specify the organizations to which you want to assign the collector group.
Limit access to specific organizations. If you toggled off the All current and future organizations option, select the organization(s) to which you want to assign the collector group.
The All current and future organizations and Limit access to specific organizations fields display only if multi-tenancy is enabled for collector groups.
Message Collector Selection. Select one or more available Message Collectors from the drop-down list to add them to the collector group.
A single Message Collector can be used by multiple collector groups. When you align a single Message Collector with multiple collector groups, the single Message Collector might then be aligned with two devices (each in a separate collector group) that use the same primary IP address or the same secondary IP address. If this happens, SL1 will generate an event.
Data Collector Selection. Select one or more available Data Collectors from the drop-down list to add them to the collector group.
Concurrent SNMP Collection. Specifies whether you want to enable concurrent SNMP collection. Concurrent SNMP collection uses asynchronous input/output for massive concurrency with lower system resource requirements. This means that Data Collectors can collect more data using fewer system resources. Concurrent SNMP collection also prevents missed polls and data gaps because collection will execute more quickly. For the selected collector group, this field overrides the value in the Behavior Settings page (System > Settings > Behavior). Your choices are:
- Use Systemwide Default. The collector group will use the global setting for concurrent SNMP collection that has been configured on the Behavior Settings page (System > Settings > Behavior).
- Enabled. Concurrent SNMP collection is enabled on this collector group regardless of the global setting on the Behavior Settings page.
- Disabled. Concurrent SNMP collection is disabled on this collector group regardless of the global setting on the Behavior Settings page.
Concurrent PowerShell Collection. Specifies whether you want to enable concurrent PowerShell collection for this collector group. Concurrent PowerShell collection allows multiple collection tasks to run at the same time with lower system resource requirements. This means that Data Collectors can collect more data using fewer system resources. Concurrent PowerShell collection also prevents missed polls and data gaps because collection will execute more quickly. The PowerShell Collector is an independent service running as a container on a Data Collector. For the selected collector group, this field overrides the value in the Behavior Settings page (System > Settings > Behavior). Your choices are:
- Use Systemwide Default. The collector group will use the global setting for concurrent PowerShell collection that has been configured on the Behavior Settings page (System > Settings > Behavior).
- Enabled. Concurrent PowerShell collection is enabled on this collector group regardless of the global setting on the Behavior Settings page.
- Disabled. Concurrent PowerShell collection is disabled on this collector group regardless of the global setting on the Behavior Settings page.
Concurrent Network Interface Collection. Specifies whether you want to enable or disable concurrent network interface collection for this collector group. Concurrent network interface collection uses asynchronous SNMP collection for all network interfaces. This provides better scalability for large networks by allowing multiple collection tasks to run at the same time with a reduced load on Data Collectors. For the selected collector group, this field overrides the value in the Behavior Settings page (System > Settings > Behavior). Your choices are:
- Use Systemwide Default. The collector group will use the global setting for concurrent network interface collection that has been configured on the Behavior Settings page (System > Settings > Behavior).
- Enabled. Concurrent network interface collection is enabled on this collector group regardless of the global setting on the Behavior Settings page.
- Disabled. Concurrent network interface collection is disabled on this collector group regardless of the global setting on the Behavior Settings page.
Collector Failover. This option is available only if you have at least two Data Collectors in the collector group. Specifies whether you want to maximize the number of devices to be managed or whether you want to maximize reliability. Your choices are:
- Off (Maximize Manageable Devices). The collector group will be load-balanced only. At any given time, the Data Collector with the lightest load handles the next discovered device. If a Data Collector fails, no data will be collected from the devices aligned with the failed Data Collector until the failure is fixed.
- On (Maximize Reliability). The collector group will be load-balanced and configured as a high-availability system that minimizes downtime. If one or more Data Collectors should fail, the tasks from the failed Data Collector will be distributed among the other Data Collectors in the collector group. ScienceLogic recommends that you use this setting.
Collectors Available for Failover. This option is available only if you selected On (Maximize Reliability) in the Collector Failover field. Specifies the minimum number of Data Collectors that must be available (i.e., with a status of "Available [0]") before a Data Collector failover may occur.
- For collector groups with only two Data Collectors, this field will contain the value "1 collector".
- For collector groups with more than two Data Collectors, the field will contain values from a minimum of one half of the total number of Data Collectors up to a maximum of one less than the total number of Data Collectors. For example, for a collector group with eight Data Collectors, the possible values in this field would be 4, 5, 6, and 7.
- SL1 will never automatically increase the maximum number of Data Collectors that can fail in a collector group. For example, suppose you have a collector group with three Data Collectors and the Collectors Available for Failover field is set to "2". If you add a fourth Data Collector to the collector group, SL1 will automatically set the Collectors Available for Failover field to "3" to keep the maximum number of Data Collectors that can fail at one. However, you can override this automatic setting by manually changing the value in the Collectors Available for Failover field.
Failback Mode. This option is available only if you selected On (Maximize Reliability) in the Collector Failover field. Specifies how you want collection to behave when the outage is fixed. Your choices are:
- Automatic. After the failed Data Collector is restored, SL1 will automatically redistribute data-collection tasks among the collector group, including the previously failed Data Collector. ScienceLogic recommends that you use this setting.
- Manual. After the failed Data Collector is restored, you must manually prompt SL1 to redistribute data-collection tasks.
Failover Delay (minutes). This option is available only if you selected On (Maximize Reliability) in the Collector Failover field. Specifies the number of minutes SL1 should wait after the outage of a Data Collector before redistributing the data-collection tasks among the other Data Collectors in the group. During this time, data will not be collected from the devices aligned with the failed Data Collector(s). The default minimum value for this field is 5 minutes. ScienceLogic recommends that you set this field to 15 minutes.
Failback Delay (minutes). This option is available only if you selected On (Maximize Reliability) in the Collector Failover field and Automatic in the Failback Mode field. Specifies the number of minutes SL1 should wait after the failed Data Collector is restored before redistributing data-collection tasks among the collector group, including the previously failed Data Collector. The default minimum value for this field is 5 minutes. ScienceLogic recommends that you set this field to 15 minutes.

Click Save.
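The rules for the Collectors Available for Failover field can be expressed as a short calculation. This is a sketch based only on the description above, not SL1 code: the function names are invented, and rounding "one half" up for an odd number of Data Collectors is an assumption.

```python
import math


def failover_choices(total_collectors):
    """Values offered in the Collectors Available for Failover field."""
    if total_collectors < 2:
        return []  # failover requires at least two Data Collectors
    if total_collectors == 2:
        return [1]
    # Assumption: for an odd total, "one half" is rounded up.
    minimum = math.ceil(total_collectors / 2)
    # The maximum is one less than the total number of Data Collectors.
    return list(range(minimum, total_collectors))


def adjusted_after_adding(old_total, old_value, new_total):
    """New field value that keeps the number of tolerated failures unchanged."""
    max_can_fail = old_total - old_value
    return new_total - max_can_fail
```

For a collector group with eight Data Collectors, `failover_choices(8)` gives 4, 5, 6, and 7. For the three-collector example above, `adjusted_after_adding(3, 2, 4)` gives 3, so only one Data Collector can still fail.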

Defining a Collector Group in the Classic SL1 User Interface

To define a new collector group in the classic SL1 user interface:

Go to the Collector Group Management page (System > Settings > Collector Groups).
In the Collector Group Management page, click the Reset button to clear the values from the fields in the top pane.
Go to the top pane and enter values in the following fields:

Collector Group Name. Name of the collector group.

Collector Failover. Specifies whether you want to maximize the number of devices to be managed or whether you want to maximize reliability. Your choices are:

Off (Maximize Manageable Devices). The collector group will be load-balanced only. At any given time, the Data Collector with the lightest load handles the next discovered device. If a Data Collector fails, no data will be collected from the devices aligned with the failed Data Collector until the failure is fixed.
On (Maximize Reliability). The collector group will be load-balanced and configured as a high-availability system that minimizes downtime. If one or more Data Collectors should fail, the tasks from the failed Data Collector will be distributed among the other Data Collectors in the collector group. ScienceLogic recommends that you use this setting.

Generate Alert on Collector Outage. Specifies whether or not the platform should generate an event if a Data Collector has an outage. ScienceLogic recommends that you select Yes for this setting.
Enable Concurrent SNMP Collection. Specifies whether you want to enable Concurrent SNMP Collection. Concurrent SNMP Collection uses asynchronous I/O for massive concurrency with lower system resource requirements. This means that Data Collectors can collect more data using fewer system resources. Concurrent SNMP Collection also prevents missed polls and data gaps because collection will execute more quickly. For the selected collector group, this field overrides the value in the Behavior Settings page (System > Settings > Behavior). Your choices are:

Use systemwide default. The collector group will use the global settings for Concurrent SNMP Collection configured in the Behavior Settings page (System > Settings > Behavior).
No. Concurrent SNMP Collection is disabled on this collector group regardless of the global setting on the Behavior Settings page.
Yes. Concurrent SNMP Collection is enabled on this collector group regardless of the global setting on the Behavior Settings page.
If the "Data Collection: SNMP Collector" process is disabled on the Process Manager page (System > Settings > Admin Processes), this concurrent SNMP collection option is disabled.

Concurrent SNMP Collection is not available in military unique deployments (MUD) or STIG-compliant deployments.

Enable Concurrent PowerShell Collection. Specifies whether you want to enable Concurrent PowerShell Collection for this collector group. If you make no selection, the default behavior is to "Use systemwide default", which uses the global setting specified on the Behavior Settings page (System > Settings > Behavior). Your choices are:

Use systemwide default. The collector group will use the global setting for Concurrent PowerShell Collection as it is configured on the Behavior Settings page (System > Settings > Behavior).
No. Concurrent PowerShell Collection is disabled on this collector group regardless of the global setting on the Behavior Settings page.
Yes. Concurrent PowerShell Collection is enabled on this collector group regardless of the global setting on the Behavior Settings page.
If the "Data Collection: PowerShell Collector" process is disabled on the Process Manager page (System > Settings > Admin Processes), this concurrent PowerShell collection option is disabled.
Concurrent PowerShell Collection is not available in military unique deployments (MUD) or STIG-compliant deployments.

Enable Concurrent Network Interface Collection. Specifies whether you want to enable or disable Concurrent Network Interface Collection for this collector group. If you make no selection, the default behavior is to "Use systemwide default", which uses the global setting specified on the Behavior Settings page (System > Settings > Behavior). Your choices are:

Use systemwide default. The collector group will use the global setting for Concurrent Network Interface Collection as it is configured in the Behavior Settings page (System > Settings > Behavior).
No. Concurrent Network Interface Collection is disabled on this collector group regardless of the global setting on the Behavior Settings page.
Yes. Concurrent Network Interface Collection is enabled on this collector group regardless of the global setting on the Behavior Settings page.

If the "Data Collection: SNMP Collector" process is disabled on the Process Manager page (System > Settings > Admin Processes), this concurrent network interface collection option is disabled.

Concurrent network interface collection is not available in military unique deployments (MUD) or STIG-compliant deployments.

Collector Selection. Displays a list of available Data Collectors.

To assign an available Data Collector server to the collector group, simply highlight it. You can assign one or more Data Collectors to a collector group.
To assign multiple Data Collectors to the collector group, hold down the <Ctrl> key and click multiple Data Collectors.

Message Collector. Displays a list of available Message Collectors.

To assign an available Message Collector to the collector group, simply highlight it. You can assign one or more Message Collectors to a collector group.
To assign multiple Message Collectors to the collector group, hold down the <Ctrl> key and click multiple Message Collectors.

NOTE: A single Message Collector can be used by multiple collector groups. When you align a single Message Collector with multiple collector groups, the single Message Collector might then be aligned with two devices (each in a separate collector group) that use the same primary IP address or the same secondary IP address. If this happens, SL1 will generate an event.

Collectors Available for Failover. Applies only if you selected On (Maximize Reliability) in the Collector Failover field. Specifies the minimum number of Data Collectors that must be available (i.e., with a status of "Available [0]") before a Data Collector failover may occur.

For collector groups with only two Data Collectors, this field will contain the value "1 collector".
For collector groups with more than two Data Collectors, the field will contain values from a minimum of one half of the total number of Data Collectors up to a maximum of one less than the total number of Data Collectors.
For example, for a collector group with eight Data Collectors, the possible values in this field would be 4, 5, 6, and 7.
SL1 will never automatically increase the maximum number of Data Collectors that can fail in a collector group. For example, suppose you have a collector group with three Data Collectors and the Collectors Available for Failover field is set to "2". If you add a fourth Data Collector to the collector group, SL1 will automatically set the Collectors Available for Failover field to "3" to keep the maximum number of Data Collectors that can fail at one. However, you can override this automatic setting by manually changing the value in the Collectors Available for Failover field.

If you set this field to half of your Data Collectors, then after a 50% Data Collector outage, the loss of one more Data Collector means that no rebalance will occur. If you set this field to one-third of the total number of Data Collectors, SL1 will attempt a rebalance until your overall capacity falls below one-third of your Data Collectors. This maximizes your resiliency but minimizes the opportunity for your system to enter an unproductive rebalancing loop.

If the number of available Data Collectors is less than the value in the Collectors Available for Failover field, SL1 will not fail over within the collector group. SL1 will not collect any data from the devices aligned with the failed Data Collector(s) until enough Data Collectors are restored to equal the value in the Collectors Available for Failover field. SL1 will generate a critical event.

Failback Mode. Applies only if you selected On (Maximize Reliability) in the Collector Failover field. Specifies how you want collection to behave when the outage is fixed. You can specify one of the following:

Automatic. After the failed Data Collector is restored, SL1 will automatically redistribute data-collection tasks among the collector group, including the previously failed Data Collector. ScienceLogic recommends that you use this setting.
Manual. After the failed Data Collector is restored, you must manually prompt SL1 to redistribute data-collection tasks by clicking the lightning bolt icon for the collector group.

Failover Delay (minutes). Applies only if you selected On (Maximize Reliability) in the Collector Failover field. Specifies the number of minutes SL1 should wait after the outage of a Data Collector before redistributing the data-collection tasks among the other Data Collectors in the group. During this time, data will not be collected from the devices aligned with the failed Data Collector(s). The default minimum value for this field is 5 minutes. ScienceLogic recommends that you set this field to 15 minutes.
Failback Delay (minutes). Applies only if you selected On (Maximize Reliability) in the Collector Failover field and Automatic in the Failback Mode field. Specifies the number of minutes SL1 should wait after the failed Data Collector is restored before redistributing data-collection tasks among the collector group, including the previously failed Data Collector. The default minimum value for this field is 5 minutes. ScienceLogic recommends that you set this field to 15 minutes.

Click the Save button to save the new collector group.

To assign devices to the collector group, see the section on Aligning Single Devices with a Collector Group in the Classic SL1 User Interface and the section on Aligning a Device Group with a Collector Group.
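The way the Collectors Available for Failover and Failover Delay fields interact can be sketched as a small decision function. This is an illustration under stated assumptions, not SL1's logic: the function name, the return strings, and the order of the two checks are invented for this example, and the default delay is the recommended 15 minutes.

```python
def failover_action(available_collectors, collectors_required,
                    minutes_since_outage, failover_delay_minutes=15):
    """Return what happens to collection tasks after a Data Collector outage."""
    if available_collectors < collectors_required:
        # Too few healthy Data Collectors: no failover, and the devices
        # aligned with the failed Data Collector(s) are not collected.
        return "no_failover"
    if minutes_since_outage < failover_delay_minutes:
        # Still inside the Failover Delay window; nothing is moved yet.
        return "waiting"
    return "redistribute"
```

For example, with four Data Collectors required and only three available, the result is always `"no_failover"`, however long the outage lasts.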

Editing a Collector Group

To edit a collector group:

Go to the Collector Groups page (Manage > Collector Groups).
Click the Actions icon of the collector group you want to edit and then select Edit. The Edit Collector Group modal appears.
The fields in the Edit Collector Group modal are populated with values from the selected collector group. You can edit one or more of the fields. For a description of each field, see the section on Defining a Collector Group.
Click Save to save any changes to the collector group.

Editing a Collector Group in the Classic SL1 User Interface

From the Collector Group Management page, you can edit an existing collector group. You can add or remove Data Collectors and change the configuration from load-balanced to failover (high availability).

To edit a collector group in the classic SL1 user interface:

Go to the Collector Group Management page (System > Settings > Collector Groups).
In the Collector Group Management page, go to the Collector Group Registry pane at the bottom of the page.
Find the collector group you want to edit. Click its wrench icon.
The fields in the top pane are populated with values from the selected collector group. You can edit one or more of the fields. For a description of each field, see the section on Defining a Collector Group in the Classic SL1 User Interface.
Click the Save button to save any changes to the collector group.

Collector Groups and Load Balancing

To perform initial discovery, SL1 uses a single, selected Data Collector from the collector group. This allows you to troubleshoot discovery if there are any problems.

After each discovered device is modeled (that is, after SL1 assigns a device ID and creates the device in the database), SL1 distributes devices among the Data Collectors in the collector group. The newest device is assigned to the Data Collector currently managing the lightest load.

This process is known as Collector load balancing, and it ensures that the work performed by the Dynamic Applications aligned to the devices is evenly distributed across the Data Collectors in the collector group.

SL1 performs Collector load balancing in the following circumstances:

A new Data Collector is added to a collector group
New devices are discovered
Failover or failback occurs within a collector group (if failover is enabled)
A user clicks the lightning bolt icon () for a collector group to manually force redistribution
Devices in DCM or DCM-R trees will be loaded on the Data Collector currently assigned to the DCM or DCM-R tree rather than being distributed across the collector group. When rebalancing occurs, each DCM or DCM-R tree is rebalanced as an aggregate to an available Data Collector with sufficient capacity to sustain the load.

Whenever a device is load-balanced from one Data Collector to another, whether due to failover or regular load balancing, the device state information is not transferred to the new Data Collector.

The lightning bolt icon () appears only for collector groups that contain more than one Data Collector. For collector groups with only one Data Collector, this icon is grayed out. This icon does not appear for All-In-One Appliances.

When all of the devices in a collector group are redistributed, SL1 will assign the devices to Data Collectors so that all Data Collectors in the collector group will spend approximately the same amount of time collecting data from devices.

Collector load balancing uses two metrics:

Device Rating. A device's rating is the total elapsed time consumed by either 1) all of the Dynamic Applications aligned to the device, or 2) collecting metrics from the device's interfaces, whichever is greater.
Collector Load. The sum of the device ratings for all of the devices assigned to a Collector. The balancer tries to evenly divide the work performed by Collectors by assigning devices to Collectors using the device ratings and Collector loads.

SL1 performs the following steps during Collector load balancing:

Searches for all devices that are not yet assigned to a collector group.
Determines the load on each Data Collector by calculating the device rating for each device on a Data Collector and then summing the device ratings.
Determines the number of new devices (less than one day old) and old devices on each Data Collector.
On each Data Collector, calculates the average device rating for old devices (sum of the device ratings for all old devices divided by the number of old devices). If there are no old devices, sets the average device rating to "1" (one).
On each Data Collector, assigns the average device rating to all new devices (devices less than one day old).
Assigns each unassigned device (either devices that are not yet assigned or devices on a failed Data Collector) to the Data Collector with the lightest load. Adds each newly assigned device rating to the total load for the Data Collector.
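The steps above can be sketched in a few lines of Python. This is only an illustrative model, not the SL1 implementation: the Device and Collector classes, the function names, and the tie-breaking rule (the first Data Collector wins when loads are equal) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    rating: float          # measured rating; only trusted for old devices
    is_new: bool = False   # True if the device is less than one day old


@dataclass
class Collector:
    name: str
    devices: list = field(default_factory=list)


def average_old_rating(collector):
    """Average rating of old devices, or 1 if the collector has none."""
    old = [d.rating for d in collector.devices if not d.is_new]
    return sum(old) / len(old) if old else 1.0


def collector_load(collector):
    """Sum of device ratings; new devices count as the old-device average."""
    avg = average_old_rating(collector)
    return sum(avg if d.is_new else d.rating for d in collector.devices)


def assign_unassigned(collectors, unassigned):
    """Give each unassigned device to the currently lightest collector."""
    loads = {c.name: collector_load(c) for c in collectors}
    for device in unassigned:
        target = min(collectors, key=lambda c: loads[c.name])
        rating = average_old_rating(target) if device.is_new else device.rating
        target.devices.append(device)
        loads[target.name] += rating  # keep the running load up to date
    return loads
```

For example, with one Data Collector at a load of 30 and another at a load of 5, a new device and a device from a failed Data Collector with a rating of 12 would both land on the second Data Collector in this model, bringing its load to 22.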

Tuning Collector Groups in the silo.conf File

With the addition of execution environments to SL1, SL1 sorts data collections into a two-process-pool model.

SL1 sorts collection requests into groups by execution environment. These groups of collection requests are called "chunks". Each chunk contains a maximum of 200 collection requests, all of which use the same execution environment. SL1 sends each chunk to a chunk worker.

The chunk worker determines the appropriate execution environment for the chunk, deploys the execution environment, and starts a pool of request workers in the execution environment.

The request workers then process the actual collection requests contained in the chunks and perform the actual data collection.
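As a rough illustration of the chunking described above, the following Python sketch groups collection requests by execution environment and splits each group into chunks of at most 200 requests. The request dictionaries and the "env" key are hypothetical; SL1's internal data structures are not documented here.

```python
from collections import defaultdict


def build_chunks(requests, chunk_size=200):
    """Group requests by execution environment, then split into chunks."""
    by_env = defaultdict(list)
    for request in requests:
        by_env[request["env"]].append(request)

    chunks = []
    for env, env_requests in by_env.items():
        # Every chunk holds requests for exactly one environment.
        for start in range(0, len(env_requests), chunk_size):
            chunks.append((env, env_requests[start:start + chunk_size]))
    return chunks
```

In this model, 450 requests for one environment and 10 requests for another produce four chunks: three for the first environment (200, 200, and 50 requests) and one for the second.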

For more information about ScienceLogic Libraries and execution environments, see the section on ScienceLogic Libraries and Execution Environments.

The following settings are available in the master.system_settings_core database table for tuning globally in a stack, or in the silo.conf file for tuning locally on a single Data Collector:

Parameter Name	Description	Runtime Default
dynamic_collect_num_chunk_workers	The number of chunk workers. In general, this value controls the number of PowerPacks that can be processed in parallel.	2
dynamic_collect_num_request_workers	The maximum number of request workers in each worker pool. In general, this value controls the number of collections within a PowerPack that can be processed in parallel.	"2" or the number of cores on the Data Collector, whichever is greater
dynamic_collect_request_chunk_size	The maximum number of collection requests in a chunk. This value controls how many collections are processed by each pool of request workers.	200

The database values for these parameters are "Null" by default, which specifies that SL1 should use the runtime defaults.

The maximum total number of worker processes used during a scheduled collection is generally dynamic_collect_num_chunk_workers X dynamic_collect_num_request_workers.
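The arithmetic can be checked with a short Python sketch. It assumes that an unset ("Null") value falls back to the runtime default from the table above; the function names are hypothetical and the sketch does not read any SL1 configuration.

```python
import os

RUNTIME_DEFAULT_CHUNK_WORKERS = 2
RUNTIME_DEFAULT_CHUNK_SIZE = 200


def effective_settings(chunk_workers=None, request_workers=None,
                       chunk_size=None, cores=None):
    """Resolve None (Null) values to the documented runtime defaults."""
    if cores is None:
        cores = os.cpu_count() or 1
    return {
        "dynamic_collect_num_chunk_workers":
            RUNTIME_DEFAULT_CHUNK_WORKERS if chunk_workers is None else chunk_workers,
        # Runtime default: 2 or the number of cores, whichever is greater.
        "dynamic_collect_num_request_workers":
            max(2, cores) if request_workers is None else request_workers,
        "dynamic_collect_request_chunk_size":
            RUNTIME_DEFAULT_CHUNK_SIZE if chunk_size is None else chunk_size,
    }


def max_worker_processes(settings):
    """Chunk workers multiplied by request workers per pool."""
    return (settings["dynamic_collect_num_chunk_workers"]
            * settings["dynamic_collect_num_request_workers"])
```

On an 8-core Data Collector with all values left at "Null", this gives 2 chunk workers and 8 request workers, for a maximum of 16 worker processes.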

There might be circumstances where adjustment is necessary to improve the performance of collection.

Example 1: Additional Environments Required

You might need to adjust the values of the collection processes when scheduled collection requires more than two environments.

Because the default number of chunk workers is "2", SL1 can simultaneously process chunks of collection requests for a maximum of two virtual environments. If the collection requests require more than two virtual environments, you can increase parallelism by setting dynamic_collect_num_chunk_workers to match the number of environments.

If you increase dynamic_collect_num_chunk_workers, you might want to decrease dynamic_collect_num_request_workers to avoid performance problems caused by too many request workers.

If you cannot increase dynamic_collect_num_chunk_workers because doing so would result in too many request workers, you can decrease dynamic_collect_request_chunk_size to give collection requests for each environment a "fairer share" of the chunk workers.

Smaller chunk sizes require more resources to establish the virtual environments and establish more pools of request workers to process the chunks. Conversely, if you want to use fewer resources for establishing virtual environments and creating pools of request workers, and you want to use more resources for collection itself, increasing dynamic_collect_request_chunk_size allows more collection requests to be processed by each pool of request workers.

Example 2: Input/Output Bound Collections

You might need to adjust the values of the collection processes when collection requests are input/output (I/O) bound with relatively large latencies.

In this scenario, you can increase dynamic_collect_num_request_workers to improve parallelism. If you increase dynamic_collect_num_request_workers, you might want to decrease dynamic_collect_num_chunk_workers to avoid performance problems caused by too many request workers.

Increasing the number of collection processes will increase CPU and memory utilization on the Data Collector, so be careful when increasing the values dramatically.

Before adjusting dynamic_collect_num_request_workers, you need to know the following information:

The number of CPU cores in the Data Collector
The current CPU utilization of the Data Collector
The current memory utilization of the Data Collector

Start by setting dynamic_collect_num_request_workers to equal the number of CPU cores plus 50%. For example: with 8 cores, start by setting dynamic_collect_num_request_workers to 12. If that is insufficient, you can then try 16, 20, 24, and so forth.
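The starting value and the follow-up values in this example can be generated with a small Python helper. The step size (half the number of cores) and the rounding up for odd core counts are assumptions inferred from the 8-core example, not documented rules.

```python
import math


def request_worker_candidates(cores, count=4):
    """Return candidate values to try, starting at cores plus 50%."""
    start = math.ceil(cores * 1.5)   # assumption: round up for odd core counts
    step = max(1, cores // 2)        # assumption: step matches the 8-core example
    return [start + i * step for i in range(count)]
```

For an 8-core Data Collector, request_worker_candidates(8) returns [12, 16, 20, 24], which matches the sequence in the text. Check CPU and memory utilization after each change before moving to the next value.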

If data collections are terminating early, it means that collections are not completed within the 15-minute limit. If this is the case, wait 30 minutes to see results after adjusting the collection values.

Load Balancing and Device State

It is important to note that, whenever a device is load-balanced from one Data Collector to another, whether due to failover or regular load balancing, the device state information is not transferred to the new Data Collector.

In the time immediately after load balancing, if an interface or another aspect of the device changes state at the same time as the device is load-balanced, then the new Data Collector might not register an event that triggers based on a device state change.

Also, if a Dynamic Application that depends on cached data being present attempts to collect data, but the cached data is not yet present on the new Data Collector, the Dynamic Application might initially fail to collect data.

Collector Affinity

Collector Affinity specifies the Data Collectors that are allowed to run collection for Dynamic Applications aligned to component devices. You can define Collector Affinity for each Dynamic Application. Choices are:

Default. If the Dynamic Application is auto-aligned to a component device during discovery, then the Data Collector assigned to the root device will collect data for this Dynamic Application as well. For devices that are not component devices, the Data Collector assigned to the device running the Dynamic Application will collect data for the Dynamic Application.
Root Device Collector. The Data Collector assigned to the root device will collect data for the Dynamic Application. This guarantees that Dynamic Applications for an entire DCM tree will be collected by a single Data Collector. You might select this option if:

The Dynamic Application has a cache dependency with one or more other Dynamic Applications.
You are unable to collect data for devices and Dynamic Applications within the same Device Component Map on multiple Data Collectors in a collector group.
The Dynamic Application will consume cache produced by a Dynamic Application aligned to a non-root device (for instance, a cluster device).
The Dynamic Application includes snippet code with a root_device tuple.

Assigned Collector. The Dynamic Application will use the Data Collector assigned to the device running the Dynamic Application. This allows Dynamic Applications that are auto-aligned to component devices during discovery to run on multiple Data Collectors. This is the default setting. You might select this option if:

The Dynamic Application has no cache dependencies with any other Dynamic Applications.
You want the Dynamic Application to be able to make parallel data requests across multiple Data Collectors in a collector group.
The Dynamic Application can be aligned using mechanisms other than auto-alignment during discovery (for instance, manual alignment or alignment via Device Class Templates or Run Book Actions).
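The three Collector Affinity choices can be summarized as a small decision function. The Python sketch below is a simplified model with hypothetical class and function names. It also assumes that, under the Default setting, a Dynamic Application that was not auto-aligned to a component device uses the Data Collector assigned to that device; the text above does not state this case explicitly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Affinity(Enum):
    DEFAULT = "default"
    ROOT_DEVICE_COLLECTOR = "root"
    ASSIGNED_COLLECTOR = "assigned"


@dataclass
class Device:
    name: str
    collector: str                    # Data Collector assigned to this device
    root: Optional["Device"] = None   # None for devices that are not components


def collector_for(device, affinity, auto_aligned=True):
    """Pick the Data Collector that runs a Dynamic Application on a device."""
    is_component = device.root is not None
    if affinity is Affinity.ROOT_DEVICE_COLLECTOR and is_component:
        return device.root.collector
    if affinity is Affinity.DEFAULT and is_component and auto_aligned:
        return device.root.collector
    # Assigned Collector, non-component devices, and (by assumption)
    # Default with a Dynamic Application that was not auto-aligned.
    return device.collector
```

In this model, a component device assigned to one Data Collector whose root device is assigned to another is collected by the root device's Data Collector under Root Device Collector, and by its own Data Collector under Assigned Collector.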

Failover for Collector Groups for Component Devices

If you specified Default or Root Device Collector for Dynamic Applications, and the single Data Collector in the collector group for component devices fails, users must create a new collector group with a single Data Collector and manually move the devices from the failed collector group to the new collector group. For details on manually moving devices to a new collector group, see the section on Changing the Collector Group for One or More Devices.

Collector Groups for Merged Devices

You can merge a physical device and a component device. There are two ways to do this:

From the Actions menu in the Device Properties page (Devices > Classic Devices > wrench icon) for either the physical device or the component device, select Merge Devices to merge devices.
From the Actions menu in the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface), select Merge Devices to merge devices in bulk.

You can unmerge a component device from a physical device. You can do this in two ways:

From the Actions menu in the Device Properties page (Devices > Classic Devices > wrench icon) for either the physical device or the component device, select Unmerge Devices to unmerge devices.
From the Actions menu in the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface), select Unmerge Devices to unmerge devices in bulk.

When you merge a physical device and a component device, the device record for the component device is no longer displayed in the user interface; the device record for the physical device is displayed in user interface pages that previously displayed the component device. For example, the physical device is displayed instead of the component device in the Device Components page (Devices > Device Components) and the Component Map page (Device Component Map). All existing and future data for both devices will be associated with the physical device.

If you manually merge a component device with a physical device, SL1 allows data for the merged component device and data from the physical device to be collected on different Data Collectors. Data that was aligned with the component device can be collected by the collector group for its root device. Data aligned with the physical device can be collected by a different collector group.

NOTE: You can merge a component device with only one physical device.

Creating a Collector Group for Data Storage Only

You can create a virtual collector group (vCUG) that serves as a storage area for all historical data from decommissioned devices.

The virtual collector group will store all existing historical data from all aligned devices, but will not perform collection on those devices. The virtual collector group will not contain any Data Collectors or any Message Collectors. SL1 will stop collecting data from devices aligned with a virtual collector group.

To define a virtual collector group in the default SL1 user interface:

Go to the Collector Groups page (Manage > Collector Groups).
Click the Add Collector Group button. The Add Collector Group modal appears.
On the Add Collector Group modal, complete the following fields:

Collector Group Name. Type a name for the collector group.
Virtual Collector Group (vCUG). Toggle this option on to make the collector group a virtual collector group. Virtual collector groups do not contain any Data Collectors or Message Collectors, and SL1 does not collect any data from devices aligned with a virtual collector group. Instead of collecting data, virtual collector groups serve only as storage areas for historical data from decommissioned devices.

Leave all other fields set to the default values. Do not include any Data Collectors or Message Collectors in the collector group.
Click Save.
To assign devices to the virtual collector group, see the section on aligning single devices with a collector group and the section on aligning a device group with a collector group.

To define a virtual collector group in the classic SL1 user interface:

Go to the Collector Group Management page (System > Settings > Collector Groups).
In the Collector Group Management page, click the Reset button to clear values from the fields in the top pane.
Go to the top pane and enter a name for the virtual collector group in the Collector Group Name field.
Leave all other fields set to the default values. Do not include any Data Collectors or Message Collectors in the collector group.
Click the Save button to save the new collector group.
To assign devices to the virtual collector group, see the section on aligning single devices with a collector group and the section on aligning a device group with a collector group.

Deleting a Collector Group

To delete a collector group:

Go to the Collector Groups page (Manage > Collector Groups).
Click the Actions icon () of the collector group you want to delete and then select Delete.

Deleting a Collector Group in the Classic SL1 User Interface

From the Collector Group Management page, you can delete a collector group. When you delete a collector group, its Data Collectors become available for use in other collector groups.

NOTE: Before you can delete a collector group, you must move all aligned devices to another collector group. For details on how to do this, see the section Changing the Collector Group for One or More Devices.

To delete a collector group in the classic SL1 user interface:

Go to the Collector Group Management page (System > Settings > Collector Groups).
In the Collector Group Management page, go to the Collector Group Registry pane at the bottom of the page.
Find the collector group you want to delete. Click its delete icon ().

Assigning a Collector Group for a Single Device

After you have defined a collector group, you can align devices with that collector group.

To assign a collector group to a device:

From the Devices page, click the name of the device that you want to assign to a collector group. The Device Investigator page opens for that device.
On the Device Investigator page, click the Settings tab.
Click the Edit button. This enables you to change your device settings.
In the Collection Poller field, select the name of the collector group that you want to use for collection on the device.
Click Save.

Assigning a Collector Group for a Single Device in the Classic SL1 User Interface

After you have defined a collector group, you can align devices with that collector group.

To assign a collector group to a device in the classic SL1 user interface:

Go to the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface).
In the Device Manager page, find the device you want to edit. Click its wrench icon (). The Device Properties page appears.
On the Device Properties page, select a collector group from the Collection fields.
Click the Save button to save the change to the device.

Aligning the Collector Group in a Device Template

You can specify a collector group in a device template. Then, when you apply the device template to a device, either through discovery or when you apply the device template to a device group or selection of devices, the specified collector group is automatically associated with the device(s). Optionally, you can later edit the collector group for each device.

For more details on device templates and device groups, see the Device Groups and Device Templates section.

Changing the Collector Group for One or More Devices

You can change the collector group for multiple devices simultaneously. This is helpful if you want to reorganize devices or collector groups. If you want to delete a collector group, you must first move each aligned device to another collector group. In this situation, you might want to change the collector group for multiple devices simultaneously.

To change the collector group for multiple devices simultaneously:

Go to the Device Manager page (Devices > Classic Devices, or Registry > Devices > Device Manager in the classic SL1 user interface).
In the Device Manager page, click on the heading for the Collection Group column to sort the list of devices by collector group.
Select the checkbox for each device that you want to move to a different collector group.
In the Select Action field (in the lower right), go to Change Collector Group and select a collector group.
Click the Go button. The selected devices will now be aligned with the selected collector group.

Managing the Host Files for a Collector Group

The Host File Entry Manager page allows you to edit and manage host files for all of your Data Collectors from a single page in the SL1 system. When you create or edit an entry in the Host File Entry Manager page, SL1 automatically sends an update to every Data Collector in the specified collector group.

The Host File Entry Manager page is helpful when:

The SL1 system does not reside in the end-customer's domain
The SL1 system does not have line-of-sight to an end-customer's DNS service
A customer's DNS service cannot resolve a host name for a device that the SL1 system monitors

For details, see the section on Managing Host Files.

Processes for Collector Groups

For troubleshooting and debugging purposes, you might find it helpful to understand the ScienceLogic processes that affect a collector group.

NOTE: You can view the list of all processes and details for each process in the Process Manager page (System > Settings > Admin Processes).

The Enterprise Database: Collector Task Manager process (em7_ctaskman) distributes devices between Data Collectors in a collector group, to load-balance the collection tasks. The process runs every 60 seconds and also checks the license on each Data Collector. The "Enterprise Database: Collector Task Manager" process (em7_ctaskman.py) redistributes devices between collectors when:

A collector group is created.
A new Data Collector is added to a collector group.
Failover or failback occurs within a collector group.
A user clicks on the lightning bolt icon () for a collector group, to manually force redistribution.

The Enterprise Database: Collector Data Pull processes retrieve information from each Data Collector in a collector group. The processes pull data from the in_storage tables on each Data Collector. The retrieved information is stored in the Database Server.

Enterprise Database: Collector Data Pull, High F (em7_hfpulld). Retrieves data from each Data Collector every 15 seconds (configurable).
Enterprise Database: Collector Data Pull, Low F (em7_lfpulld). Retrieves data from each Data Collector every five minutes.
Enterprise Database: Collector Data Pull, Medium (em7_mfpulld). Retrieves data from each Data Collector every 60 seconds.

The Enterprise Database: Collector Config Push process (config_push.py) updates each Data Collector with information on system configuration, configuration of Dynamic Applications, and any new or changed policies. This process runs once every 60 seconds and checks for differences between the configuration tables on the Database Server and the configuration tables on each Data Collector. The list of tables to be synchronized is stored in master.definitions_collector_config_tables on the Database Server.
Asynchronous Processes (for example, discovery or programs run from the Device Toolbox page). Asynchronous processes need to be run immediately and cannot wait until the "Enterprise Database: Collector Config Push" process (config_push.py) runs and tells the Data Collector to run the asynchronous process. Therefore, SL1 uses a stored procedure and the "EM7 Core: Task Manager" process (em7) to trigger asynchronous processes on both the Database Server and Data Collector.

If a user requests an asynchronous process, a stored procedure on the Database Server inserts a new row in the table master_logs.spool_process on the Database Server.
Every three seconds, the "EM7 Core: Task Manager" process (proc_mgr.py) checks the table master_logs.spool_process on the Database Server for new rows.
If the asynchronous process needs to be started on a Data Collector, a stored procedure on the Database Server inserts the same row into the table master_logs.spool_process on the Data Collector.
Every three seconds, the "EM7 Core: Task Manager" process (em7) checks the table master_logs.spool_process on the Data Collector for new rows.
If the "EM7 Core: Task Manager" process (em7) on the Data Collector finds a new row, the specified asynchronous process is executed on the Data Collector.
If a Database Server becomes the passive node in a High Availability configuration, the "EM7 Core: Task Manager" process (em7) stops core services.
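The hand-off of asynchronous processes described in the steps above can be modeled with a small in-memory simulation. This Python sketch is illustrative only: the SpoolTable class stands in for the master_logs.spool_process table on one appliance, the row keys ("process", "collector") are hypothetical, and the forwarding that the text attributes to a stored procedure is folded into the Database Server polling function for brevity.

```python
POLL_INTERVAL_SECONDS = 3  # each Task Manager checks its table every three seconds


class SpoolTable:
    """In-memory stand-in for master_logs.spool_process on one appliance."""

    def __init__(self):
        self.rows = []
        self._seen = 0

    def insert(self, row):
        self.rows.append(row)

    def new_rows(self):
        """Return rows added since the previous poll."""
        fresh = self.rows[self._seen:]
        self._seen = len(self.rows)
        return fresh


def database_poll(db_spool, collector_spools, run):
    """One polling pass on the Database Server."""
    for row in db_spool.new_rows():
        target = row.get("collector")
        if target is None:
            run("database", row["process"])
        else:
            # Copy the same row to the spool table on the Data Collector.
            collector_spools[target].insert(row)


def collector_poll(name, spool, run):
    """One polling pass on a Data Collector."""
    for row in spool.new_rows():
        run(name, row["process"])
```

In a real system each polling function would run in a loop that sleeps for POLL_INTERVAL_SECONDS between passes; the sketch leaves the loop out so the flow is easy to follow.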

Enabling and Disabling Concurrent PowerShell for Collector Groups

To improve the process of collecting data via PowerShell, you can enable Concurrent PowerShell Collection. Concurrent PowerShell Collection allows multiple collection tasks to run at the same time with a reduced load on Data Collectors. Concurrent PowerShell Collection also prevents missed polls and data gaps because collection will execute more quickly. As a result, Data Collectors can collect more data using fewer system resources.

When you use the PowerShell Collector for Concurrent PowerShell Collection, the collection process can bypass failed or paused collections, reduce collection time, and reduce the number of early terminations (sigterms) that occur with data collection. The PowerShell Collector is an independent service running as a container on a Data Collector.

You can enable one or more collector groups to use concurrent PowerShell collection, and you can collect metrics for concurrent PowerShell collection.

Concurrent PowerShell Collection is for PowerShell Performance and Performance Configuration Dynamic Application types, and does not include Snippet Dynamic Applications that happen to run PowerShell commands.

Concurrent PowerShell Collection is not available in military unique deployments (MUD) or STIG-compliant deployments.

For more details on concurrent PowerShell collection, see the section on Concurrent PowerShell in the manual Monitoring Windows Systems with PowerShell.

Enabling Concurrent PowerShell on All Collector Groups

To enable concurrent PowerShell collection service for all collector groups:

Go to the Database Tool page (System > Tools > DB Tool).

The Database Tool page is available only in versions of SL1 prior to 12.2.1 and displays only for users that have sufficient permissions to access the page.
Enter the following in the SQL Query field:

INSERT INTO master.system_custom_config (`field`, `field_value`) VALUES ('enable_powershell_service', '1');

Disabling Concurrent PowerShell on All Collector Groups

To disable concurrent PowerShell collection service for all collector groups:

Go to the Database Tool page (System > Tools > DB Tool).

The Database Tool page is available only in versions of SL1 prior to 12.2.1 and displays only for users that have sufficient permissions to access the page.

Enter the following in the SQL Query field:

UPDATE master.system_custom_config SET field_value=0 where field='enable_powershell_service';

Enabling Concurrent PowerShell on a Specific Collector Group

To enable concurrent PowerShell collection for a specific collector group:

Go to the Database Tool page (System > Tools > DB Tool).

The Database Tool page is available only in versions of SL1 prior to 12.2.1 and displays only for users that have sufficient permissions to access the page.
Enter the following in the SQL Query field:

INSERT INTO master.system_custom_config (`field`, `field_value`, `cug_filter`) VALUES ('enable_powershell_service_CUGx', '1', 'collector_group_ID');

where:

collector_group_ID is the collector group ID. You can find this value in the Collector Group Management page (System > Settings > Collector Groups).

Disabling Concurrent PowerShell on a Specific Collector Group

To disable concurrent PowerShell collection for a specific collector group:

Go to the Database Tool page (System > Tools > DB Tool).

The Database Tool page is available only in versions of SL1 prior to 12.2.1 and displays only for users that have sufficient permissions to access the page.

Enter the following in the SQL Query field:

UPDATE master.system_custom_config SET field_value=0 where field='enable_powershell_service_CUGx';

where:

collector_group_ID is the collector group ID. You can find this value in the Collector Group Management page (System > Settings > Collector Groups).

Enabling and Disabling Concurrent SNMP for Collector Groups

To increase the scale for SNMP collection, you can enable Concurrent SNMP Collection. Concurrent SNMP Collection uses the standalone container called the SL1 SNMP Collector.

The SNMP Collector is an independent service that runs as a container on a Data Collector. When you enable Concurrent SNMP Collection, each Data Collector will contain four (4) SNMP Collector containers.

On each Data Collector, SL1 will restart each of the SNMP Collector containers periodically to ensure that each container remains healthy. When one SNMP Collector container is restarted, the other three SNMP Collector containers continue to handle the workload.

With Concurrent SNMP Collection, SNMP collection tasks can run in parallel. A single failed task will not prevent other tasks from completing.

Concurrent SNMP Collection provides:

Improved throughput for SNMP Dynamic Applications
Reduced use of resources on each Data Collector
More dependable collection from high-latency devices

Concurrent SNMP Collection is not available in military unique deployments (MUD) or STIG-compliant deployments.

Enabling and Disabling Concurrent SNMP for All Collector Groups

This feature is disabled by default.

To enable Concurrent SNMP Collection in SL1:

Go to the Behavior Settings page (System > Settings > Behavior).
Check the Enable Concurrent SNMP Collection field.
Click Save.

If the "Data Collection: SNMP Collector" process is disabled on the Process Manager page (System > Settings > Admin Processes), this concurrent SNMP collection option is disabled.

If you do not want all of your SL1 Data Collectors to use Concurrent SNMP Collection, you can specify which collector groups should use it. For details, see the section on Enabling and Disabling Concurrent SNMP for Collector Groups.

Enabling and Disabling Concurrent SNMP for Collector Groups

Depending on the needs of your SL1 environment, you can enable or prevent a collector group from using concurrent SNMP collection.

To enable Concurrent SNMP Collection for an SL1 collector group:

Go to the Collector Group Management page (System > Settings > Collector Groups).
Click the wrench icon () for the collector group you want to edit. The fields at the top of the page are updated with the data for that collector group.
Select an option in the Enable Concurrent SNMP Collection drop-down field:
- Use system-wide default. Select this option if you want this collector group to use or not use Concurrent SNMP Collection based on the Enable Concurrent SNMP Collection field on the Behavior Settings page. This is the default.
- Yes. Select this option to enable Concurrent SNMP Collection for this collector group, even if you did not enable it on the Behavior Settings page.
- No. Select this option to prevent this collector group from using Concurrent SNMP Collection, even if you did enable it on the Behavior Settings page.

If the "Data Collection: SNMP Collector" process is disabled on the Process Manager page (System > Settings > Admin Processes), this concurrent SNMP collection option is disabled.

Update the remaining fields as needed, and then click Save.
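The interaction between the collector group option and the system-wide setting can be expressed as a short Python function. This is a sketch with hypothetical names, and it assumes that when the "Data Collection: SNMP Collector" process is disabled, Concurrent SNMP Collection is effectively off regardless of the other two settings.

```python
def concurrent_snmp_enabled(group_option, system_default,
                            snmp_collector_process_enabled=True):
    """Resolve whether a collector group uses Concurrent SNMP Collection.

    group_option: "system" (Use system-wide default), "yes", or "no".
    system_default: state of the checkbox on the Behavior Settings page.
    """
    if not snmp_collector_process_enabled:
        return False  # assumption: a disabled process overrides everything
    if group_option == "yes":
        return True
    if group_option == "no":
        return False
    if group_option == "system":
        return bool(system_default)
    raise ValueError(f"unknown option: {group_option!r}")
```

For example, a collector group set to Yes uses Concurrent SNMP Collection even when the system-wide setting is off, and a collector group set to No does not use it even when the system-wide setting is on.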

Enabling Multi-tenancy for Collector Groups

To support multi-tenancy, SL1 allows you to align collector groups with one, multiple, or all organizations in SL1. When you align an organization to a collector group, you control who can view details about that collector group and who can apply the collector group in SL1.

By default, newly created collector groups are aligned to all organizations. However, you can update the organization setting for a collector group if you have multi-tenancy enabled for collector groups.

When multi-tenancy is enabled:

An administrative user can update the organization alignment for all collector groups.
Non-administrative users can update all collector groups that are aligned to all organizations or the organizations to which the user belongs.

If you enable multi-tenancy for collector groups, you might encounter a situation where a device is not aligned to a collector group because the device and the collector group do not belong to the same organization.

If you enable multi-tenancy for collector groups, ScienceLogic strongly recommends that you do not disable it at a later date.

To enable multi-tenancy for collector groups:

Go to the Database Tool page (System > Tools > DB Tool).

The Database Tool page is available only in versions of SL1 prior to 12.2.1 and is displayed only to users who have sufficient permissions to access the page.
Select "master" as the database.
Type the following in the SQL Query field:

UPDATE master.system_settings_core SET enable_cug_orgs=1
Click Go.

Aligning Collector Groups to Organizations

To align existing collector groups to organizations:

Go to the Collector Groups page (Manage > Collector Groups).
Click the Actions icon of the collector group you want to edit, and then select Edit. The Edit Collector Group modal appears.
On the Edit Collector Group modal, complete the following fields:

All current and future organizations. Toggle this option on to align the collector group to all of your SL1 organizations, or toggle it off to specify the organizations to which you want to assign the collector group.
Limit access to specific organizations. If you toggled off the All current and future organizations option, select the organization(s) to which you want to assign the collector group.

Click Save to save any changes to the collector group.

Aligning Collector Groups to Organizations in the Classic SL1 User Interface

To align existing collector groups to organizations in the classic SL1 user interface:

Go to the Collector Group Management page (System > Settings > Collector Groups).
In the Collector Group Registry pane, click the Organization icon of the collector group you want to align to an organization. The Align Organizations modal appears.
In the Align Organizations modal, complete the following fields:

Collector Group Availability. Select All Organizations to align the collector group to all of your SL1 organizations, or select Aligned Organizations Only to specify the organizations to which you want to assign the collector group.
Aligned Organizations. If you selected Aligned Organizations Only in the Collector Group Availability field, select the organization(s) to which you want to assign the collector group.

Click Save.