Updating SL1

This section provides an overview of the System Updates page, detailed steps for performing an SL1 upgrade, and detailed steps on upgrading MariaDB, upgrading PowerPacks, and performing reboots.

Use the following menu options to navigate the SL1 user interface:

  • To view a pop-out list of menu options, click the menu icon.
  • To view a page containing all of the menu options, click the Advanced menu icon.

The System Updates Page

The System Updates page (System > Tools > Updates) allows you to update the software on your SL1 appliances.

You must first download the update file to the local computer.

You can then import the software update through the user interface.

After you import a software update to your SL1 system, the SL1 system can automatically stage the software update. Staging is when the software is copied to each ScienceLogic appliance. Staging allows SL1 to simultaneously apply the software changes to each ScienceLogic appliance, regardless of the speed of the connection to each ScienceLogic appliance.

You can allow the SL1 system to automatically stage the software or you can manually stage the software.

After the software update is staged, you can deploy the software.

To apply updates to an existing Data Collector, that Data Collector must be a member of a Collector Group. In some SL1 systems, users might have to create a Collector Group for a single Data Collector before applying updates.

To conserve disk space on Data Collectors and Message Collectors, after an update, SL1 removes previous Docker images.

The Workflow for Updating SL1

The following sections describe the steps to plan and deploy an SL1 update. If you would like assistance planning an upgrade path that minimizes downtime, please contact your Customer Success Manager.

The workflow for updating SL1 is:

  1. Plan the update.
  2. Schedule maintenance windows.
  3. Review pre-upgrade best practices for SL1.
  4. Back up custom settings in the NextUI.
  5. Back up SSL certificates.
  6. Set the timeout for PhoneHome Watchdog.
  7. Adjust the timeout for slow connections.
  8. Run the system status script on the Database Server or All-In-One before upgrading.
  9. Update the SL1 Distributed Architecture using the System Update tool.
  10. Upgrade MariaDB, if needed.
  11. Reboot SL1 appliances, if needed.
  12. Restore custom settings in the NextUI.
  13. Restore SSL Certificates.
  14. Reset the timeout for PhoneHome Watchdog.
  15. Update default PowerPacks.
  16. Configure Subscription Billing (one time only).

Planning the Update

Before upgrading SL1, perform the following steps that are specific to your organization:

  1. Read the release notes to determine:
  • What is fixed?
  • What is new?
  • What has changed?
  • What has been deprecated?
  2. Read the Known Issues for the release at https://support.sciencelogic.com/s/known-issues.
  3. Identify all integrations and third-party applications that access the SL1 database or manipulate data on SL1. Determine how to disable these integrations during the deployment and re-enable them after deployment.
  4. Identify the activities and customers that will be affected by maintenance windows, schedule the windows accordingly, and inform those customers.
  5. Identify custom work (PowerPacks, Run Book Automations, Event policies, Dashboard widgets) and ensure that it is backed up so you can restore it if necessary.

Ensure that each SL1 node or appliance has 3 GB of free space in the /var partition to allow you to stage the upgrade. Ensure that each SL1 node or appliance has 1 GB of free space in / (the root partition) to allow you to deploy the upgrade.
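
The free-space requirements above can be checked from a shell session on each appliance. The following is a minimal sketch that uses the standard df command. The function names free_kb and check_free_space are hypothetical helpers written for this example and are not SL1 tools.

```shell
#!/bin/sh
# Sketch only: free_kb and check_free_space are hypothetical helper names.

# Print the free space, in KB, of the filesystem that holds the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 if the path has at least required_kb free, 1 otherwise.
check_free_space() {
    path=$1
    required_kb=$2
    avail_kb=$(free_kb "$path")
    if [ "$avail_kb" -ge "$required_kb" ]; then
        echo "OK: $path has ${avail_kb} KB free (need ${required_kb} KB)"
        return 0
    fi
    echo "LOW: $path has ${avail_kb} KB free (need ${required_kb} KB)"
    return 1
}
```

For example, check_free_space /var 3145728 covers the 3 GB staging requirement, and check_free_space / 1048576 covers the 1 GB deployment requirement.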

If you are upgrading from a version of SL1 prior to version 8.10.0, see Updating SL1 Appliances to Oracle Linux.

Scheduling Maintenance Windows

Upgrading SL1 includes a minimum of two and possibly four maintenance windows:

  • Import and stage the update and run the pre-upgrade script. These steps can take place prior to the day of upgrade and do not affect SL1 functionality. ScienceLogic suggests that you perform these steps at least three days before the planned upgrade and ideally a week before the planned upgrade.
  • Deploy update. On the day of the upgrade, put all SL1 appliances in maintenance mode. The SL1 system will not be available during this procedure. Update the SL1 Distributed systems.
  • Update MariaDB (if required). The SL1 system will not be available during this procedure. Refer to the release notes for your current release to determine if you must upgrade MariaDB.
  • Reboot Appliances (if required). Individual SL1 appliances will not be available during these procedures. Refer to the release notes for your current release to determine if you must reboot all SL1 appliances after upgrading.

Identify activities and users that will be affected by these maintenance windows, and schedule the maintenance windows appropriately. Be sure to communicate all downtime with users.

Pre-Upgrade Best Practices for SL1

Before you upgrade, check the following:

  • Review the hardware specifications of all the appliances in your system to ensure they meet the requirements for the current usage of your system. For more details about sizing and capacity for your specific environment, contact your Customer Success Manager and see https://support.sciencelogic.com/s/system-requirements.
  • Verify that recent backups are available for your system.
  • Ensure that each SL1 appliance has a valid license.
  • Ensure that a Data Collector is a member of a Collector Group if you are applying updates to an existing Data Collector. In some SL1 systems, users might have to create a Collector Group for a single Data Collector.
  • Ensure that each Data Collector is listed as "Available" to the Database Server. To check, see the Collector Status page (System > Monitor > Collector Status).

Backing Up Custom Settings in the NextUI

To save any custom settings in the NextUI:

  1. Log in to the console of the Database Server or SSH to the Database Server.

  2. Open a shell session and type the following at the shell prompt:

    cp /opt/em7/nextui/nextui.env /opt/em7/nextui/nextui.env.backup

Backing Up SSL Certificates

To back up your SSL Certificates:

  1. Log in to the console of the Database Server or SSH to the Database Server.

  2. Open a shell session and type the following commands at the shell prompt:

    cp /etc/nginx/silossl.key /etc/nginx/silossl.key.bak

    cp /etc/nginx/silossl.pem /etc/nginx/silossl.pem.bak

  3. Repeat these steps on each Database Server in your SL1 system.
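
If you want to confirm that each backup copy matches its original, the cp commands above can be wrapped in a small helper. This is a sketch only. The function names backup_and_verify and backup_ssl_certs are hypothetical, and the file paths are the ones used in the steps above.

```shell
#!/bin/sh
# Sketch only: backup_and_verify and backup_ssl_certs are hypothetical names.

# Copy a file to <file>.bak, preserving mode and timestamps, then confirm
# that the copy is byte-for-byte identical to the original.
backup_and_verify() {
    src=$1
    cp -p "$src" "$src.bak" || return 1
    cmp -s "$src" "$src.bak"
}

# Back up both certificate files named in this section.
backup_ssl_certs() {
    for f in /etc/nginx/silossl.key /etc/nginx/silossl.pem; do
        backup_and_verify "$f" || { echo "Backup failed: $f" >&2; return 1; }
    done
}
```

Run backup_ssl_certs with the same privileges you would use for the cp commands.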

Setting the Timeout for PhoneHome Watchdog

You can manually adjust the settings for the PhoneHome Watchdog server to reduce CPU consumption during the upgrade process. To do this:

  1. Log in to the console of the Data Collector as the root user or open an SSH session on the Data Collector.
  2. At the command line, type the following:

    phonehome watchdog view
  3. You should see something like the following:

    Current settings:
    autosync: yes
    interval: 20
    state: enabled
    autoreconnect: yes
    timeoutcount: 2
    check: default
  4. Note the settings for interval and timeoutcount, so you can restore them after the upgrade.
  5. To change the settings for SL1 upgrade, type the following at the command line:

    sudo phonehome watchdog set interval=120
    sudo phonehome watchdog set timeoutcount=2
    sudo systemctl stop em7_ph_watchdog
    sudo systemctl start em7_ph_watchdog
  6. Repeat the steps in this section on each Data Collector.
  7. Repeat the steps in this section on each Message Collector.
  8. Repeat the steps in this section on each Database Server.
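
Because you must restore the original interval and timeoutcount values after the upgrade, it can help to capture them with a script instead of noting them by hand. The following sketch assumes that the output of phonehome watchdog view uses the "key: value" layout shown in the sample above. The function name watchdog_setting is a hypothetical helper, not part of the phonehome tool.

```shell
#!/bin/sh
# Sketch only: watchdog_setting is a hypothetical helper name.
# It reads "key: value" lines from standard input and prints the value
# for the requested key. Leading indentation is ignored.
watchdog_setting() {
    awk -v key="$1" '
        { sub(/^[ \t]+/, "") }            # drop leading indentation
        index($0, key ":") == 1 {         # line starts with "key:"
            value = substr($0, length(key) + 2)
            sub(/^[ \t]+/, "", value)     # drop the space after the colon
            print value
        }'
}
```

For example, phonehome watchdog view | watchdog_setting interval prints the current interval, which you can redirect to a file and read back when you reset the timeout after the upgrade.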

Adjusting the Timeout for Slow Connections

If you have slow connections between SL1 appliances, you can adjust the timeout values for staging and deploying upgrades.

To adjust the timeouts:

  1. Log in to the console of the Database Server or SSH to the Database Server.

  2. Open a shell session and type the following at the shell prompt:

    sudo pcli set-patcher-param staging_wait_time <timeout_in_seconds>

    where:

    <timeout_in_seconds> is the timeout value, in seconds, for staging for each SL1 appliance. The default value is 1800 seconds (30 minutes). You can increase this value for slow connections.

  3. Type the following at the shell prompt:

    sudo pcli set-patcher-param deploy_wait_time <timeout_in_seconds>

    where:

    <timeout_in_seconds> is the timeout value, in seconds, for deploying to each SL1 appliance. The default value is 3600 seconds (1 hour). You can increase this value for slow connections.

If you are upgrading from a version of SL1 prior to version 8.14.0, see Adjusting the Timeout for SL1 8.12 and Prior Releases.
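
Both timeout parameters take a value in seconds. If you think of the timeouts in minutes, the conversion can be sketched as follows. The function name patcher_timeout_cmds is a hypothetical helper. It only prints the two pcli commands from this section so that you can review them before running them yourself.

```shell
#!/bin/sh
# Sketch only: patcher_timeout_cmds is a hypothetical helper name.
# It prints the two commands and does not run them.
patcher_timeout_cmds() {
    staging_min=$1   # staging timeout, in minutes
    deploy_min=$2    # deployment timeout, in minutes
    echo "sudo pcli set-patcher-param staging_wait_time $((staging_min * 60))"
    echo "sudo pcli set-patcher-param deploy_wait_time $((deploy_min * 60))"
}
```

Calling patcher_timeout_cmds 30 60 prints the commands with the default values of 1800 and 3600 seconds.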

Running the System Status Script Before Upgrading

SL1 includes a script, system_status.sh, that provides diagnostic data for each node or appliance in your SL1 system.

On SL1 systems prior to 10.2.0, after running the system status script, you must ensure that the file /var/log/em7/silo.log has the owner and group "s-em7-core".
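
One way to confirm the owner and group of a file is with the stat command. The following sketch assumes GNU coreutils stat, which provides the -c option. The function names owner_and_group and has_owner_and_group are hypothetical helpers.

```shell
#!/bin/sh
# Sketch only: owner_and_group and has_owner_and_group are hypothetical names.
# Assumes GNU coreutils stat (the -c option).

# Print "owner:group" for the given path.
owner_and_group() {
    stat -c '%U:%G' "$1"
}

# Return 0 if the path has exactly the expected "owner:group" pair.
has_owner_and_group() {
    [ "$(owner_and_group "$1")" = "$2" ]
}
```

Pass the path of the silo.log file and the expected pair s-em7-core:s-em7-core to has_owner_and_group. If the check fails, correct the ownership with chown.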

Running the System Status Script

If you are running SL1 version 8.14.0 or later, SL1 automatically runs the system status script every 15 minutes on each node or appliance in your SL1 system.

ScienceLogic recommends that you view the output from the system status script before upgrading:

  1. In SL1, go to the Appliance Manager page (System > Settings > Appliances).
  2. Locate the SL1 appliance that you want to view diagnostic information about.
  3. Click its magnifying-glass icon to view the output of the system status script for that appliance.
  4. If the output includes errors and you need help fixing them, contact ScienceLogic Customer Support to fix the errors before upgrading.
  5. Repeat for each node or appliance in your SL1 system.

To get the very latest status before upgrading, manually run the system status script on each Database Server or All-In-One Appliance.

If you are upgrading from an SL1 8.12.x system, see Running the System Status Script on SL1 8.12.x Releases. If you are upgrading from an SL1 8.10.x or prior release, see Running the System Status Script on SL1 8.10 and Prior Releases.

Updating the SL1 Distributed Architecture

Version 8.12.1.3 introduced delta-less upgrades, which let you import a single file and upgrade to the latest version. Delta-less upgrades can upgrade the SL1 system from any SL1 release 8.6.0 or later to the current release, using only a single update.

Any SL1 Distributed Architecture system running 8.6.0 or later can be upgraded by importing, staging, and deploying a single update file.

Ensure that each SL1 appliance has 3 GB of free space in the /var partition to allow you to stage the upgrade. Ensure that each SL1 appliance has 1 GB of free space in / (the root partition) to allow you to deploy the upgrade.

Upgrading the SL1 Distributed Stack includes the following steps:

If you are upgrading from an SL1 8.1.1 or later system, see Upgrading the SL1 Distributed Architecture on SL1 Versions 8.5.0 and Earlier.

For systems running an SL1 version prior to 8.12.0, go to the System Updates page and disable automatic staging (System > Tools > Updates > Actions > Disable automatic staging).

If you have previously used manual staging, perform these additional steps:

  1. Select all updates in the EM7 Releases pane and select all updates in the ScienceLogic OS pane.
  2. In the Select Action menu, select Unstage Update (remove staging policy override). Click Go.

NOTE: For software that was previously staged with automatic staging, Unstage Update (remove staging policy override) does not affect staging.

Downloading the Update

Before you can load a patch or update onto your instance of the SL1 system, you must first download the patch or update to your local computer.

The following steps do not affect the performance of the SL1 system. ScienceLogic recommends that you perform these steps at least three days before upgrading.

To download the patch or update:

  1. Log in to https://support.sciencelogic.com. Use your ScienceLogic customer account and password to access this site.
  2. From the Product Downloads menu, select Platform. The Platform Downloads page appears.
  3. Find the release you are interested in and click its name. The Release Version page appears.
  4. Click the specific link for a release, if needed.
  5. Click the link for the release image or release patch you want to download, and click the Download File button. The file is then downloaded to your local computer.

Importing the Update

To import a product update onto your SL1 system:

  1. In the SL1 system, go to the System Updates page (System > Tools > Updates).
  2. In the System Updates page, click the Import button.
  3. In the Import a new update modal page, browse to the product update file and select it.
  • If you select the Auto Stage button, the SL1 system will begin staging as soon as the import is completed.
  • If you do not select the Auto Stage button, you must click the staging button after the import is completed. You can do so at any time after the import has completed.
  • For more information on automatic staging and manual staging, see the section on Staging.
  4. Click the Import button.
  5. In the System Updates page, the Import Status column can have one of the following statuses:
  • In Progress. Software is currently being imported by the SL1 system.
  • Complete. Software has been imported successfully.
  • Failed. Software import has failed due to an unexpected condition. Contact ScienceLogic Support for assistance.
  • Missing Base. The SL1 system cannot import this software until another software package has been imported. The dependency is for compression purposes. Check the log for a message stating which software package needs to be imported.
  6. The update file or patch file is imported into the SL1 system and appears in the System Updates page.

NOTE: For details on the import process, go to the System Updates page, find the entry for the software you are interested in, go to its Import Status column, and click the log icon.

Staging the Update

After you import a software update to your SL1 system, you must stage the software update. During staging, the SL1 system copies the software update to each SL1 appliance. Staging allows SL1 to simultaneously apply the software changes to each SL1 appliance, regardless of the speed of the connection to each SL1 appliance. The SL1 system stages updates per import. You can choose to stage imports automatically or manually.

For easiest troubleshooting, ScienceLogic recommends that you manually stage imports.

The Staging Status column on the System Updates page can have one of the following statuses:

  • --. No staging request is active and software has not been staged on any SL1 appliances.
  • Scheduled. The SL1 system is aware of the staging request and is preparing for staging.
  • In Progress. Staging is in progress but has not completed. The page displays the percentage complete as staging progresses.
  • Complete. Staging has completed, and all appliances are ready to deploy the software.
  • Incomplete. Staging has completed, and one or more, but not all, appliances are ready to deploy the software.
  • Canceled. User manually canceled the staging process.
  • Outdated. The current update is not the latest or has already been installed.
  • Failed. An unexpected error occurred in the staging process. Contact ScienceLogic Support.

For details on the staging process, go to the System Updates page, find the entry for the software you are interested in, go to its Staging Status column, and click the log icon.

After the software update is imported and staged, you can deploy the software.

Automatic Staging

To enable automatic staging:

  1. In SL1, go to the System Updates page (System > Tools > Updates).
  2. In the System Updates page, click the Import button.
  3. In the Import a new update modal page, browse to the product update file and select it. If you select the Auto Stage button, the SL1 system will begin staging as soon as the import is completed.
  4. After import, in the System Updates page, the Staging Status column will display the number of ScienceLogic appliances that have been successfully staged compared to the total number of ScienceLogic appliances.

To disable automatic staging:

  1. In SL1, go to the System Updates page (System > Tools > Updates).
  2. In the System Updates page, click the Import button.
  3. In the Import a new update modal page, browse to the product update file and select it.
  4. If you do not select the Auto Stage button, you must click the staging button after the import is completed. You can do so at any time after the import has completed.

Manually Staging an Update

You can manually stage a software update:

  • If you imported an update but do not want to stage it immediately.
  • If you add another ScienceLogic appliance to your SL1 system and need to apply software updates.
  • If staging failed on one or more ScienceLogic appliances.
  • If you want to ensure that a previous staging process was successful.

When you manually stage a software update, SL1 checks the status of the software update on each ScienceLogic appliance. SL1 then stages the software update only to those SL1 appliances that have not yet been staged for this software update.

To manually stage a software update:

  1. In SL1, go to the System Updates page (System > Tools > Updates).
  2. Locate the software update you want to stage and click its staging icon. The software update will be copied to each ScienceLogic appliance that has not yet been staged.
  3. The Staging Status column will display the number of ScienceLogic appliances that have been successfully staged compared to the total number of ScienceLogic appliances.

Monitoring Staging

For SL1 versions 8.12.0 and later, you can monitor the staging process:

  1. Either go to the console of the Database Server or use SSH to access the Database Server.

  2. Type the following at the shell prompt:

    monitor_stage

  3. In the monitor_stage results, look for the following information:

  • System Update Vitals. Displays the current status of the services that are required for System Update.
  • Staging Process Stats. Displays status of staging on all SL1 appliances.

Running the Pre-Upgrade Check

After importing and staging an update, you can run a pre-upgrade check before deploying. The pre-upgrade check will ensure that all criteria are met before deploying.

If you are upgrading from SL1 8.14 or earlier, see Running the Pre-Upgrade Check for SL1 8.14 or Earlier.

The pre-upgrade check examines the following:

  • Is each SL1 Appliance eligible to be updated?
  • Are updates enabled on each SL1 Appliance?
  • Are any of the SL1 Appliances running CentOS 5?
  • Is the hosts file on each SL1 Appliance correctly configured?
  • Is each Data Collector and Message Collector in a Collector Group?
  • Is there enough free space on the disk to perform the upgrade?
  • Is the RPM database corrupted?
  • Are the RPM packages corrupted?
  • Does the patch hook directory have the correct owner assigned?
  • Are the CRM templates on High Availability and Disaster Recovery systems out of date?

In addition, the pre-upgrade check:

  • Creates the file /etc/init.d/mysql if it does not exist
  • Skips SL1 appliances that have been deleted since the last upgrade

Running the Pre-Upgrade Check

To run a pre-upgrade check:

  1. Go to the System Updates page (System > Tools > Updates).
  2. Find the upgrade that you want to deploy.
  3. Click the purple checkmark at the end of the row. The pre-upgrade check will run.
  4. If a pre-upgrade criterion fails, the Deploy button will be disabled for the selected row.
  5. To view the output from the pre-upgrade check, click the magnifying-glass icon in the selected row.
  6. If the pre-upgrade check finds a failure, see the list below for possible causes.
  7. Fix all failures before deploying the update.

Potential Issues to Address

CentOS 5 Failure

CentOS 5 is no longer supported by System Update. If one or more Data Collectors are running CentOS 5, the pre-upgrade check will fail. Contact your Customer Success Manager to determine how to upgrade your Data Collectors.

Collector Group Membership

This test checks that each Data Collector and Message Collector is a member of a Collector Group.

If a Data Collector or Message Collector is not a member of a Collector Group, the pre-upgrade test will define the appliance as "not eligible for patching."

To fix this error, add the Data Collector or Message Collector to a Collector Group.

Eligibility Failure

The most common reasons for eligibility failure are:

  • The SL1 appliance is not licensed or the license has expired
  • The SL1 appliance cannot be reached over the network
  • The Data Collector has failed over
  • The SL1 appliance is not configured
  • The Data Collector is waiting to be returned to service
  • The Data Collector is not assigned to a Collector Group

Enabled Failure

By default, all SL1 appliances are enabled for patching.

However, if you have used a command-line tool to exclude an SL1 appliance from updates, the pre-upgrade check will fail. To fix this error, include the SL1 appliance for updates.

Free Disk-Space Failure

This test checks the root partition and requires 1GB of free disk space. If the root partition does not have 1GB of free disk space, the pre-upgrade check will fail.

If the root partition does not have 1GB of free disk space, you must archive or delete files that are no longer required or add a new empty disk and resize the filesystem.

Host File Failure

This test validates the /etc/hosts file for the presence of an IPv6 entry for localhost, which is required by System Update.

If /etc/hosts does not include an IPv6 entry for localhost, the pre-upgrade test automatically adds the required entry.

Check the following in case of failure:

  • The /etc/hosts file exists
  • The /etc/hosts file can be edited by root
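
To see for yourself whether a hosts file already contains the IPv6 entry, you can use a check like the following sketch. It assumes that the required entry is a line whose address is ::1 and whose names include localhost. The exact rule that the pre-upgrade test applies is not documented here, and the function name has_ipv6_localhost is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: has_ipv6_localhost is a hypothetical helper name.
# Return 0 if the hosts file has an uncommented line whose first field is
# ::1 and that lists the name "localhost"; return 1 otherwise.
has_ipv6_localhost() {
    hosts_file=${1:-/etc/hosts}
    awk '
        $1 == "::1" {
            for (i = 2; i <= NF; i++)
                if ($i == "localhost") found = 1
        }
        END { exit (found ? 0 : 1) }' "$hosts_file"
}
```

With no argument, has_ipv6_localhost examines /etc/hosts. The function only reads the file and never modifies it.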

Patch-Hook Ownership Failure

If the owner of the patch hook directory (/var/lib/em7/patch_hook) is incorrect, the pre-upgrade test automatically fixes the ownership. However, if this error occurs, check for the following:

  • The patch hook directory (/var/lib/em7/patch_hook) does not exist
  • The s-em7-core user or the s-em7-core group does not exist

RPM Database Failure

If the RPM database fails the pre-upgrade test, the RPM database is corrupted.

To recover the RPM database:

  1. Either go to the console of the Database Server or use SSH to access the Database Server. Log in with the credentials you defined when you installed the Database Server.
  2. At the shell prompt, enter the following:

    mkdir -p /tmp/rpm.bak
    cp /var/lib/rpm/* /tmp/rpm.bak
    rm -f /var/lib/rpm/__db*
    rpm --rebuilddb -vv
    rpm -q kernel
    
  3. If the last command returns a value, you can delete the backup directory using the following command:

    rm -Rf /tmp/rpm.bak

RPM Package Failure

If one or more RPM packages failed the pre-upgrade test, possible causes are:

  • Packages are not staged, and hence some files are missing. This can be caused by a failed staging attempt or a timeout during staging. You can try to stage again. You can also adjust the timeout for staging.
  • Duplicate packages
  • Conflicting packages
  • Unmet dependencies

Duplicate Packages

  1. Either go to the console of the Database Server or use SSH to access the Database Server. Log in with the credentials you defined when you installed the Database Server.
  2. At the shell prompt, enter the following command:

    sudo package-cleanup --dupes

  3. If there are duplicate packages, use the following command to remove them:

    sudo package-cleanup --cleandupes --removenewestdupes

Conflicting Packages

  1. Look for conflicting packages in the staging log.
  2. Verify that the package is part of the SL1 ISO or patch bundle.
  3. If the package is not part of the SL1 ISO or patch bundle, uninstall the package.

Unmet Dependencies

You will need to reset the staging status of the appliance and stage it again. Contact ScienceLogic Customer Success for help in resetting the staging status.

Putting All SL1 Appliances into Maintenance Mode

ScienceLogic recommends that you perform these steps during a maintenance window.

Immediately before deploying a software update, ScienceLogic recommends that you put all SL1 appliances in maintenance mode. This will prevent spurious error messages and events during the deployment.

To enable user maintenance mode for all the SL1 appliances in your SL1 system:

  1. Go to the Appliance Manager page (System > Settings > Appliances). Note the list of SL1 appliances in your system.
  2. Go to the Device Manager page (Registry > Devices > Device Manager) and select the checkbox for each SL1 appliance in your SL1 system. This includes both primary and secondary Database Servers.
  3. In the Select Action drop-down list, select Change User Maintenance Mode: Enabled without Collection. This option puts the selected devices into user maintenance mode with collection disabled. The devices will remain in this state until you or another user disables user maintenance mode.
  4. Click the Go button.

Deploying the Update

During deployment, avoid the following tasks:

  • Running integrations and third-party applications that access the SL1 database or manipulate data on SL1
  • Running discovery sessions
  • Running nightly discovery
  • Bringing HA/DR out of maintenance mode
  • Adding new SL1 Appliances
  • Importing a new patch
  • Adding Data Collectors to a Collector Group
  • Removing Data Collectors from a Collector Group
  • Rebalancing a Collector Group
  • Killing processes related to patching and upgrading
  • Running reporting jobs
  • Unpausing the proc_mgr process

When you deploy an update, the update is installed on all nodes or appliances that have already been staged.

When you deploy an update, SL1 checks to ensure that you have already deployed all required updates. If you have not, SL1 will generate an error message specifying the updates you must deploy before continuing with the current update.

During deployment, the Deployment Status column on the System Updates page can have one of the following statuses:

  • --. No deployment request is active, and software has not been deployed on any SL1 appliances.
  • Scheduled. The SL1 system is aware of the deployment request and is preparing for deployment.
  • In Progress. Deployment is in progress but has not completed.
  • Complete. Deployment has completed, and all appliances are updated.
  • Incomplete. Deployment has completed, and one or more, but not all, appliances are updated.
  • Canceled. User manually canceled the deployment.
  • Outdated. The current update is not the latest or has already been installed.
  • Failed. An unexpected error occurred in the deployment process. Contact ScienceLogic Support.

To deploy a software update on your nodes or appliances:

  1. Make sure that you have imported and staged the update file.
  2. Go to the System Updates page (System > Tools > Updates).
  3. In the System Updates page, find the software update you want to deploy. Click the lightning-bolt icon to deploy the software. If SL1 is still staging the patch when you click the lightning-bolt icon, SL1 will wait until staging has completed before deploying the updates to each ScienceLogic appliance.
  4. The software update will be deployed to all appliances in your SL1 system that have already been staged. If one or more appliances in your SL1 system have been successfully staged, SL1 will deploy the update to those appliances.

For details on the deployment process, go to the System Updates page, find the entry for the software you are interested in, go to its Deployment Status column, and click the log icon.

Troubleshooting System Update

You can use the sysuptb troubleshooting tool to determine issues with System Update and to generate diagnostic information about the update. You can also use the phtb tool to troubleshoot issues with the PhoneHome configuration.

These tools can be useful when System Update does not work as expected, or if you have issues with the PhoneHome configuration or with communication between appliances and the Database Server. These tools are available on all SL1 appliances starting with SL1 version 10.2.0, and the tools are backwards-compatible to SL1 version 8.12.0.

Using the sysuptb Troubleshooting Tool

To use the sysuptb troubleshooting tool:

  1. Either go to the console of any SL1 appliance or use SSH to access the appliance.
  2. Enter the following at the shell prompt:

    sudo sysuptb -h

  3. For more information about each argument, enter the following at the shell prompt:

    sudo sysuptb <argument> -h
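
Most sysuptb tests end with a summary line, such as "Free disk space test summary: [Total: 2, Pass: 2, Failed: 0]" in the examples later in this section. If you capture the output, a script can flag summaries that report problems. The following sketch treats any non-zero counter whose name starts with Failed, Corrupt, Incomplete, or Hung as a problem. That choice of counters is an assumption based on the sample output, and the function name summary_problems is a hypothetical helper, not a sysuptb feature.

```shell
#!/bin/sh
# Sketch only: summary_problems is a hypothetical helper name.
# Reads sysuptb output on standard input, prints every summary line that
# has a non-zero Failed, Corrupt, Incomplete, or Hung counter, and returns
# 1 if any such line was found.
summary_problems() {
    awk '
        /summary: \[/ {
            line = $0
            body = $0
            sub(/^.*\[/, "", body)        # keep the text inside the brackets
            sub(/\].*$/, "", body)
            n = split(body, pairs, /, */)
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, /: */)
                if (kv[1] ~ /^(Failed|Corrupt|Incomplete|Hung)/ && kv[2] + 0 > 0) {
                    print line
                    bad = 1
                    break
                }
            }
        }
        END { exit (bad ? 1 : 0) }'
}
```

For example, sudo sysuptb all | summary_problems prints nothing and returns 0 when every summary is clean.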

Available Commands

  • The following command executes all troubleshooting tests for System Update:

    sudo sysuptb all <optionally -x name_of_test_to_exclude>

    To learn more about a test run, use this command: sudo sysuptb help <test-name>

  • Example:

    sudo sysuptb all
    Executing filestore tests
    912 / 912 [-----------------------------------------------------] 100.00% 14 p/s
    Filestore test summary: [Total: 912, Intact: 912, Incomplete: 0, Corrupt: 0]
    Executing test for deleted appliances in patch history
    No deleted appliances were found in the patch history
    Executing test for invalid file id in patch schedules
    No patch schedules were found to have invalid file id
    Executing test for RPM database corruption
    RPM database is intact
    Executing test to check if filestore is empty
    Filestore has 1026 files
    Executing test for deactivating services
    Service test summary: [Total: 1, Active: 1, Inactive: 0, Healed: 0, Skipped: 0, Failed (to heal): 0]
    Executing test for free disk space
    Free disk space test summary: [Total: 2, Pass: 2, Failed: 0]
    Executing test for service errors
    Service error test summary: [Total: 2, Without Errors: 2, Restarted: 0, Failed: 0]
    Executing hosts file check for IPV6 entry (::1) for localhost
    An entry for ::1 is already present in the hosts file
    Proxy is not configured for yum.
    Executing test for hung yum process
    No yum processes found
    Yum process summary: [Total: 0, Hung: 0]
  • The following command searches the logs for errors that match a service name and restarts services if any errors are found.

    sudo sysuptb check-service-error <optionally -s name_of_service>

    If you do not provide the name of a service, the command searches the logs for errors for siloupdate-pkgserver.service and siloupdate-spool.service.

  • Example:

    sudo sysuptb check-service-error
    Executing test for service errors
    Service error test summary: [Total: 2, Without Errors: 2, Restarted: 0, Failed: 0]
  • The following command removes deleted SL1 appliances from the history of system updates so that SL1 does not search for them during an update.

    sudo sysuptb clear-mids
  • Example:

    sudo sysuptb clear-mids
    Executing test for deleted appliances in patch history
    No deleted appliances were found in the patch history
    
  • The following command cancels all scheduled updates that include an invalid ID for the patch file.

    sudo sysuptb clear-schedule
  • Example:

    sudo sysuptb clear-schedule
    Executing test for invalid file id in patch schedules
    No patch schedules were found to have invalid file id
  • The following command checks the filestore of downloaded packages for corrupt files and marks the corrupt files as incomplete.

    sudo sysuptb filestore

  • Example:

    sudo sysuptb filestore
    Executing filestore tests
    912 / 912 [-----------------------------------------------------------] 100.00% 14 p/s
    Filestore test summary: [Total: 912, Intact: 912, Incomplete: 0, Corrupt: 0]
  • The following command checks the file system for available free space.

    sudo sysuptb free-space <optionally, -d path_for_drive = minimum_size> 

    If you do not provide the path and minimum size of the directory, the command examines /var to make sure it has 300MB of free space and / to make sure it has 1GB of free space.

  • Example:

    sudo sysuptb free-space --disk /var=300MB
    Executing test for free disk space
    Free disk space test summary: [Total: 1, Pass: 1, Failed: 0]
  • The following command checks for update services that are stuck in a deactivating state and then heals them.

    sudo sysuptb heal-service <optionally -s service_name>

    If you do not specify a service, the command examines the service em7_patch_manager.service.

  • Example:

    sudo sysuptb heal-service
    Executing test for deactivating services
    Service test summary: [Total: 1, Active: 1, Inactive: 0, Healed: 0, Skipped: 0, Failed (to heal): 0]

  • The following command checks the /etc/hosts file for an entry for IPv6 for the current server (like a loopback address). If no entry exists, the command adds ::1 to the /etc/hosts file.

    sudo sysuptb hosts
  • Example:

    sudo sysuptb hosts
    Executing hosts file check for IPV6 entry (::1) for localhost
    An entry for ::1 is already present in the hosts file
  • The following command checks whether the filestore that holds the upgrade packages is empty.

    sudo sysuptb is-filestore-empty

  • Example:

    sudo sysuptb is-filestore-empty
    Executing test to check if filestore is empty
    Filestore has 1026 files
  • The following command checks the RPM database on /var/lib/rpm for corruption. If the command detects corruption, the output includes steps for remediation.

    sudo sysuptb rpmdb

  • Example:

    sudo sysuptb rpmdb
    Executing test for RPM database corruption
    RPM database is intact
  • The following command checks for hung yum processes.

    sudo sysuptb yum-proc <optionally, -t timeout_in_minutes>

    If you do not specify a running time, in minutes, the command searches for yum processes that have been running for more than 120 minutes.

  • Example:

    sudo sysuptb yum-proc
    Executing test for hung yum process
    No yum processes found
    Yum process summary: [Total: 0, Hung: 0]
  • The following command checks whether yum is configured with a proxy. If so, the command removes the proxy configuration.

    sudo sysuptb yum-proxy
  • Example:

    sudo sysuptb yum-proxy

    Proxy is not configured for yum.
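
Several of the sysuptb commands above end with a bracketed summary line of "name: count" pairs. If you call these commands from a wrapper script, a small parser can pull out a single counter. The following is a minimal sketch, assuming the output format matches the examples shown above; the function summary_field is hypothetical and is not part of sysuptb.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of sysuptb: print one counter from a summary
# line such as
#   Free disk space test summary: [Total: 2, Pass: 2, Failed: 0]
# Usage: summary_field "<summary line>" "<field name>"
summary_field() {
    # Match the field name only when it directly follows "[" or "," so that
    # "Active" does not match inside "Inactive".
    printf '%s\n' "$1" | sed -n "s/.*[[,] *$2: *\([0-9][0-9]*\).*/\1/p"
}

# Example: report a problem when a test summary shows failures.
line='Free disk space test summary: [Total: 2, Pass: 2, Failed: 0]'
if [ "$(summary_field "$line" 'Failed')" != "0" ]; then
    echo "free-space test reported failures" >&2
fi
```

In practice the line would come from the command itself, for example by capturing the output of sudo sysuptb free-space and passing its last line to summary_field.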

Using the phtb Troubleshooting Tool

To use the phtb troubleshooting tool:

  1. Either go to the console of an SL1 appliance using PhoneHome communication or use SSH to access the appliance.

  2. Enter the following at the shell prompt:

    sudo phtb -h

  3. For more information about each argument, enter the following at the shell prompt:

    sudo phtb <argument> -h

    To learn more about a test run, use this command: sudo phtb help <test-name>

Available Commands

  • The following command checks destinations for SSH connectivity issues:

    sudo phtb destination
  • The following command checks the target host for SSH connectivity issues:

    sudo phtb probe-host
  • The following command checks connectivity to the proxy host, if configured:

    sudo phtb proxy

Monitoring Deployment

For SL1 versions 8.12.0 and later, you can monitor deployment. To do so:

  1. Either go to the console of the Database Server or use SSH to access the Database Server.
  2. Enter the following command at the shell prompt:

    monitor_deploy
  3. The command displays output that includes the following sections:

  • System Update Vitals. Displays the current status of the services that are required for System Update.
  • Deployment Process Stats. Displays status of deployment on all SL1 appliances.
  • Module Level Status. Displays the status of the three deployment steps.

Remove SL1 Appliances from Maintenance Mode

To disable user maintenance mode for all the SL1 appliances in your SL1 system:

  1. Go to the Appliance Manager page. Note the list of SL1 appliances in your system.
  2. Go to the Device Manager page (Registry > Devices > Device Manager) and select the checkbox for each SL1 appliance in your SL1 system.
  3. In the Select Action drop-down list, select Change User Maintenance Mode: Disabled. This option disables user maintenance mode for the selected devices.
  4. Click the Go button.

Refer to the release notes for your current release to determine if you must upgrade MariaDB after upgrading.

Also refer to the release notes for your current release to determine if you must reboot all SL1 appliances after upgrading.

Updating SL1 Extended Architecture

As of January 1, 2021, new installations of SL1 Extended Architecture are available only on SaaS deployments.

For existing on-premises deployments of SL1 Extended Architecture, please contact ScienceLogic Customer Support for upgrade documentation and help with technical issues.

Automatically Upgrading MariaDB with a Script

To reduce spurious events, you can put the Database Server in maintenance mode while you upgrade MariaDB. For details, see the chapter on Putting the Database Server into Maintenance Mode.

Refer to the release notes for your current release to determine if you must upgrade MariaDB. Not every SL1 update requires an upgrade of MariaDB.

SL1 will automatically update MariaDB-client, MariaDB-common, and MariaDB-shared RPMs but will not update the MariaDB Server RPM. You must update the MariaDB Server RPM after you install the SL1 update.

SL1 10.1.0 and later releases include the module_upgrade_mariadb script to automatically upgrade MariaDB server.

You should store all custom configuration settings for each MariaDB database in the file /etc/siteconfig/mysql.siteconfig. If you have added custom settings to the file /etc/my.cnf.d/silo_mysql.cnf, those changes will be overwritten each time you upgrade MariaDB. Before upgrading, copy any custom settings to the file /etc/siteconfig/mysql.siteconfig. SL1 will save these custom settings and apply them after you upgrade MariaDB.

The module_upgrade_mariadb script:

  • Upgrades the following SL1 appliances:
    • All Database Servers
    • All-In-One Appliances
    • Data Collectors
    • Message Collectors
  • Upgrades High Availability (HA) and Disaster Recovery (DR) systems
  • Includes a "test only" option before executing upgrade
  • Enforces upgrading the primary Database Server before upgrading the secondary Database Server and the Data Collectors.
  • Will skip SL1 appliances that have already been updated
  • Logs entire sequence of commands and output for later analysis
  • Stores log files in /data/logs/module_upgrade_mariadb.log and /data/logs/.upgrade_mariadb.log
  • Checks for differences between the current configuration and the version you are about to install, and spawns an alert if it finds any. To skip this check, use the -s option.

To upgrade MariaDB, perform the following:

  1. Either go to the console of the Database Server or use SSH to access the Database Server.
  2. At the shell prompt, enter the following command:

    sudo /opt/em7/bin/module_upgrade_mariadb -m all
  3. To see all the options for the module_upgrade_mariadb script, enter the following command at the shell prompt:

    /opt/em7/bin/module_upgrade_mariadb -h

    Usage:

    module_upgrade_mariadb -m <module_id> [-t|--test] [-y|--assumeyes] [-s|--skip_conf_file_error][-p|--pool size <number_of_modules>][-h|--help]

  4. The script includes these options:
    • -m parameter specifies the SL1 appliances that you want to upgrade. You can specify:
      • -m <mid1, mid2…midN> : upgrade the appliances with the specified comma-separated module IDs.
      • -m all : upgrade all appliances (Database Servers, All-In-One Appliances, Data Collectors, and Message Collectors).
      • -m all-db : upgrade all Database Servers.
      • -m all-cu : upgrade all Data Collectors and Message Collectors.
    • -t parameter specifies not to upgrade but instead to run a test of the upgrade script.
    • -y parameter specifies to automatically enter "yes" at all prompts.
    • -s parameter specifies to ignore errors in the MySQL configuration files and proceed with the upgrade.
    • -p <number_of_modules> parameter specifies the number of Data Collectors that you want to upgrade simultaneously. Database Servers are upgraded one at a time. Possible values are 1 - 20. The default value is 1.
  5. To view the status of the automatic upgrade, enter the following command:

    monitor_upgrade_mariadb
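
The options described above can be combined on one command line. The following is a minimal sketch of a wrapper that checks the -p value against the documented range of 1 to 20 and prints the resulting command for review instead of running it. The function build_upgrade_cmd is hypothetical and is not shipped with SL1; only the -m, -p, and -t options listed above are used.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not shipped with SL1: print a module_upgrade_mariadb
# command line after validating the arguments described in this section.
# Usage: build_upgrade_cmd <module_ids|all|all-cu> <pool_size> [test]
build_upgrade_cmd() {
    target=$1
    pool=$2
    mode=${3:-}
    # The pool size must be a whole number.
    case $pool in
        ''|*[!0-9]*) echo "pool size must be a whole number" >&2; return 1 ;;
    esac
    # The option list documents a range of 1 through 20 for -p.
    if [ "$pool" -lt 1 ] || [ "$pool" -gt 20 ]; then
        echo "pool size must be between 1 and 20" >&2
        return 1
    fi
    cmd="sudo /opt/em7/bin/module_upgrade_mariadb -m $target -p $pool"
    # -t runs the script in test mode without upgrading.
    if [ "$mode" = "test" ]; then
        cmd="$cmd -t"
    fi
    printf '%s\n' "$cmd"
}

# Example: print a test-only run for all collectors, four at a time.
build_upgrade_cmd all-cu 4 test
```

Printing the command first makes it easy to start with a test-only run and then repeat the same command without the test argument.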

Additional Steps for MariaDB Upgrades in 10.1.x

SL1 10.1.x included an upgrade to MariaDB. The upgrade did not include a tool, jemalloc, that helps manage memory usage.

This section applies only to the following releases:

  • 10.1.0
  • 10.1.1
  • 10.1.2
  • 10.1.3
  • 10.1.4
  • 10.1.4.1
  • 10.1.4.2
  • 10.1.5
  • 10.1.5.1

For SL1 versions later than 10.1.5.1, jemalloc is included with the platform. For SL1 versions prior to 10.1.0, jemalloc is included with the platform.

To avoid problems with memory usage on Database Servers, perform the following steps after upgrading MariaDB for 10.1.x.

Perform these steps first on the active Database Server and then on each additional Database Server in your SL1 System.

  1. Open an SSH session to the Database Server.
  2. To verify that the Database Server is not currently running jemalloc, enter the following command at the shell prompt:

    silo_mysql -e 'show global variables like "version_malloc_library"'
  3. If the Database Server is not currently running jemalloc, the shell will display the following:

    Variable Name Value
    version_malloc_library system
  4. Search for the file /usr/lib64/libjemalloc.so.1.

    If the file does not exist, contact ScienceLogic Customer Support to request the file jemalloc-3.6.0-1.el7.x86_64.rpm.

    To install the RPM, use a file-transfer utility to copy the file to a directory on the SL1 appliance. Then enter the following commands at the shell prompt:

    cd /usr/lib64
    sudo yum install jemalloc-3.6.0-1.el7.x86_64.rpm
  5. Create the file /etc/systemd/system/mariadb.service.d/jemalloc.conf, as follows:

    vi /etc/systemd/system/mariadb.service.d/jemalloc.conf
  6. Add the following lines to the file:

    [Service]
    Environment="LD_PRELOAD=/usr/lib64/libjemalloc.so.1"
  7. Save and close the file.
  8. Reload the systemd config files with the following command:

    sudo systemctl daemon-reload
  9. Restart the Database Server:

    To restart the standalone Database Server or the primary Database Server in a cluster, enter the following:

    sudo systemctl restart mariadb

    To restart each secondary Database Server in a cluster:

    1. Open an SSH session to the secondary Database Server. At the shell prompt, enter:

      coro_config
    2. Select 1.
    3. When prompted to put the Database Server into maintenance, select y.
    4. Open an SSH session to the primary Database Server. To pause SL1, enter the following command at the shell prompt:

      sudo touch /etc/.proc_mgr_pause
    5. In the SSH session for the secondary Database Server, restart MariaDB:

      crm resource restart mysql
    6. After MariaDB has restarted successfully on the secondary Database Server, return to the SSH session on the primary Database Server. Remove the pause file for SL1 using the following command:

      sudo rm /etc/.proc_mgr_pause
    7. In the SSH session on the secondary Database Server, take the Database Server out of maintenance. At the shell prompt, enter:

      coro_config
    8. Select 1.
    9. When prompted to take the Database Server out of maintenance, select y.
  10. To verify that jemalloc is running on the Database Server, enter the following command at the shell prompt:

    silo_mysql -e 'show global variables like "version_malloc_library"'
  11. If the Database Server is currently running jemalloc, the shell will display something like the following:

    Variable Name Value
    version_malloc_library jemalloc 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340
  12. Perform these steps on each Database Server in your SL1 system.
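
The drop-in file and the verification in the steps above can also be scripted. The following is a minimal sketch under stated assumptions: both function names are hypothetical and not part of SL1, the target directory is passed as an argument so that the sketch can be tried in a scratch directory first, and the check expects the query output in the form shown in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of SL1.

# Write the drop-in file described in the steps above into the given
# directory. On an appliance the directory would be
# /etc/systemd/system/mariadb.service.d and the call would need root
# privileges, followed by "sudo systemctl daemon-reload".
write_jemalloc_dropin() {
    dir=$1
    mkdir -p "$dir" || return 1
    printf '%s\n' \
        '[Service]' \
        'Environment="LD_PRELOAD=/usr/lib64/libjemalloc.so.1"' \
        > "$dir/jemalloc.conf"
}

# Return success if the text printed by the version_malloc_library query
# reports a jemalloc allocator rather than "system".
uses_jemalloc() {
    printf '%s\n' "$1" | grep -q '^version_malloc_library[[:space:]]*jemalloc'
}

# Example: try the drop-in in a scratch directory and inspect the result.
scratch=$(mktemp -d)
write_jemalloc_dropin "$scratch/mariadb.service.d"
cat "$scratch/mariadb.service.d/jemalloc.conf"
```

On a Database Server, the output of the silo_mysql query could be captured in a variable and passed to uses_jemalloc to decide whether the remaining steps are needed.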

Manually Upgrading MariaDB

Refer to the release notes for your current release to determine if you must upgrade MariaDB. Not every SL1 update requires an upgrade of MariaDB.

ScienceLogic strongly recommends that you upgrade MariaDB using the script described in Automatically Upgrading MariaDB with a Script.

To reduce spurious events, you can put the Database Server in maintenance mode while you upgrade MariaDB. For details, see the chapter on Putting the Database Server into Maintenance Mode.

If you prefer to upgrade MariaDB manually, the following sections describe how to upgrade the MariaDB server for different SL1 appliance types and architectures.

When you update MariaDB, you must update the following SL1 appliances:

  • All Database Servers
  • All-In-One Appliances
  • Data Collectors
  • Message Collectors

You should store all custom configuration settings for each MariaDB database in the file /etc/siteconfig/mysql.siteconfig. If you have added custom settings to the file /etc/my.cnf.d/silo_mysql.cnf, those changes will be overwritten each time you upgrade MariaDB. Before upgrading, copy any custom settings to the file /etc/siteconfig/mysql.siteconfig. SL1 will save these custom settings and apply them after you upgrade MariaDB.

Download RPMs to SL1 Appliances

Before upgrading MariaDB, you must copy the RPMs from the primary Database Server to the Database Servers, All-In-One Appliances, Data Collectors, and Message Collectors in your SL1 system. To do this:

  1. Open an SSH session to the Database Server.
  2. To download the latest RPMs from the Database Server, enter the following at the shell prompt:

    For SL1 version 10.1.0 to 10.1.5:

    wget --output-document /tmp/MariaDB-server-10.4.12-1.el7.centos.x86_64.rpm http://localhost:10080/MariaDB-server.rpm
    wget --output-document /tmp/galera-4-26.4.3-1.rhel7.el7.centos.x86_64.rpm http://localhost:10080/galera-4.rpm

    For SL1 version 10.1.6 and higher, download all of the packages listed when you enter the command:

    cat /opt/em7/share/db_packages
    wget --output-document /tmp/MariaDB-server.rpm http://localhost:10080/<mariadb-server-pkg-from-db_packages> 
    wget --output-document /tmp/galera-4.rpm http://localhost:10080/<galera-4-pkg-from-db_packages>
    wget --output-document /tmp/socat.rpm http://localhost:10080/<socat-pkg-from-db_packages> 
  3. Verify that the downloaded packages are valid (not corrupt or incomplete downloads) by entering the following commands:

    rpm -qip /tmp/MariaDB-server.rpm
    rpm -qip /tmp/galera-4.rpm
    rpm -qip /tmp/socat.rpm 

    If any errors are reported, restart siloupdate-pkgserver using the following command, and then download and verify the RPM files again.

    systemctl restart siloupdate-pkgserver
  4. Use SCP or another secure copy program to copy these files to the /tmp directory on each Database Server, All-In-One Appliance, Data Collector, and Message Collector:
    • MariaDB-server.rpm
    • galera-4.rpm
    • socat.rpm (SL1 version 10.1.6 and higher)

    To conserve disk space, ScienceLogic recommends that you delete the RPMs from the /tmp directory on the Database Servers, All-In-One Appliances, Data Collectors, and Message Collectors in your SL1 system after you successfully upgrade MariaDB.
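
For SL1 version 10.1.6 and higher, the wget commands in step 2 depend on the package names listed in /opt/em7/share/db_packages. The following is a minimal sketch that turns such a list into wget commands for review. It assumes that the file contains one package file name per line, which this section does not state; the function print_wget_cmds is hypothetical and is not part of SL1.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of SL1: read package file names on standard
# input (assumed to be one per line) and print a wget command for each one.
print_wget_cmds() {
    while IFS= read -r pkg; do
        # Skip blank lines.
        [ -n "$pkg" ] || continue
        # Use the fixed /tmp file names that the later steps expect.
        case $pkg in
            MariaDB-server*) out=/tmp/MariaDB-server.rpm ;;
            galera-4*)       out=/tmp/galera-4.rpm ;;
            socat*)          out=/tmp/socat.rpm ;;
            *)               out=/tmp/$pkg ;;
        esac
        printf 'wget --output-document %s http://localhost:10080/%s\n' "$out" "$pkg"
    done
}

# Example with a sample list; on a Database Server the input would be
# /opt/em7/share/db_packages instead.
printf '%s\n' 'MariaDB-server-10.4.12-1.el7.centos.x86_64.rpm' \
    'galera-4-26.4.3-1.rhel7.el7.centos.x86_64.rpm' | print_wget_cmds
```

The function only prints the commands, so you can compare them with the package list before running them.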

Manually Upgrade Two Database Servers Configured for High Availability or Disaster Recovery

To upgrade a High Availability or Disaster Recovery cluster, perform the following steps:

The system will be unavailable when performing these steps.

Step 1: On the Secondary Database Server

You must put the secondary Database Server in maintenance mode. To do this:

  1. Open an SSH session to the Database Server.
  2. At the shell prompt, assume root privileges:

    sudo -s
  3. When prompted, enter the administrator password.
  4. At the shell prompt, enter the following command:

    coro_config

    The following menu appears:

    1) Enable Maintenance
    2) Option Disabled
    3) Promote DRBD
    4) Stop Pacemaker
    5) Resource Status
    6) Quit
  5. Enter "1".

Step 2: On the Primary Database Server

  1. To determine the current installed version of the RPMs, enter the following command:

    sudo rpm -qa ^MariaDB-*
  2. To stop SL1 and MariaDB, enter the following commands at the shell prompt:

    sudo systemctl stop em7
    sudo systemctl stop mariadb.service
  3. To stop the MySQL resource, enter the following command:

    sudo crm resource stop mysql
  4. To save the current enabled state for mariadb.service, enter the following command:

    export MSRV=`sudo systemctl is-enabled mariadb.service`
  5. Check the version of MariaDB-server that you are running.

    rpm -q MariaDB-server
  6. You must follow the steps below that correspond to your version of MariaDB. Step 7 is specific to MariaDB-server version 10.1.x, while Step 9 is specific to MariaDB-server versions 10.4.12 and higher.

  7. MariaDB-server version 10.1.x. If you are running MariaDB-server version 10.1.x:
  8. Do these steps in order. Doing the steps in any other order will result in unintended consequences.

    1. Remove MariaDB-server by using the following command:

      sudo rpm --nodeps -ev MariaDB-server
    2. Replace the Galera package and install the new MariaDB-server package by using the following commands:

      sudo yum --disablerepo=* swap -- remove galera -- install /tmp/galera-4.rpm
      sudo yum --disablerepo=* install /tmp/MariaDB-server.rpm
  9. MariaDB-server version 10.4.12 and higher. If you are running MariaDB-server version 10.4.12 or higher, upgrade the MariaDB-server package and dependent packages (galera-4 and socat) by using the following commands:

    sudo yum --disablerepo=* install /tmp/galera-4.rpm /tmp/socat.rpm
    sudo yum --disablerepo=* upgrade /tmp/MariaDB-server.rpm
  10. To remove incompatible backup packages, enter the following command:

    sudo yum remove percona-xtrabackup
  11. NOTE: If the "yum remove" command fails, it means that the package does not exist on the SL1 appliance. You can ignore the error message.

  12. To regenerate the configuration file for MariaDB, enter the following command:

    sudo /opt/em7/share/scripts/generate-my-conf.py -f -o /etc/my.cnf.d/silo_mysql.cnf
  13. To restart MariaDB, enter the following commands:

    sudo systemctl daemon-reload
    sudo systemctl start mariadb
  14. To restart the MySQL resource, enter the following command:

    sudo crm resource start mysql
  15. To restore the mariadb.service enabled state, enter the following command:

    sudo systemctl ${MSRV::-1} mariadb.service
  16. To upgrade the internal configuration for the database, enter the following:

    sudo mysql_upgrade -u root -p
  17. To restart the em7 service and then confirm the installed versions of the RPMs, enter the following commands:

    sudo systemctl start em7
    sudo rpm -qa ^MariaDB-*
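
The step that restores the mariadb.service enabled state relies on a bash substring expansion: ${MSRV::-1} removes the last character of the saved state, so "enabled" becomes the systemctl verb "enable" and "disabled" becomes "disable". The following is a minimal sketch of the same idea with a guard for other states; the function state_to_verb is hypothetical and is not part of SL1. Negative lengths in this expansion require bash 4.2 or later.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the ${MSRV::-1} expansion.
# ${var::-1} means: start at offset 0 and stop one character before the end.
state_to_verb() {
    local state=$1
    case $state in
        enabled|disabled)
            printf '%s\n' "${state::-1}"
            ;;
        *)
            # systemctl is-enabled can also print states such as "static"
            # or "masked", which do not map to a verb this way.
            echo "unexpected state: $state" >&2
            return 1
            ;;
    esac
}

# Example: show the verb for each of the two expected states.
for s in enabled disabled; do
    echo "$s -> $(state_to_verb "$s")"
done
```

With such a guard, the restore step could be written as sudo systemctl "$(state_to_verb "$MSRV")" mariadb.service, which fails visibly instead of passing an invalid verb to systemctl.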

Step 3: On the Secondary Database Server

  1. Determine the current installed version of the RPMs using the following command:

    sudo rpm -qa ^MariaDB-*
  2. To save the current enabled state for mariadb.service, enter the following:

    export MSRV=`sudo systemctl is-enabled mariadb.service`
  3. Check the version of MariaDB-server that you are running.

    rpm -q MariaDB-server
  4. You must follow the steps that correspond to your version of MariaDB. Step 5 is specific to MariaDB-server version 10.1.x, while Step 7 is specific to MariaDB-server versions 10.4.12 and higher.

  5. MariaDB-server 10.1.x: If you are running MariaDB-server version 10.1.x:
  6. Do these steps in order. Doing the steps in any other order will result in unintended consequences.

    1. Remove MariaDB-server by using the following command:

      sudo rpm --nodeps -ev MariaDB-server
    2. Replace the Galera package and install the new MariaDB-server package by using the following commands:

      sudo yum --disablerepo=* swap -- remove galera -- install /tmp/galera-4.rpm
      sudo yum --disablerepo=* install /tmp/MariaDB-server.rpm
  7. MariaDB-server 10.4.12 or higher: If you are running MariaDB-server version 10.4.12 or higher, upgrade the MariaDB-server package and dependent packages (galera-4 and socat) by using the following commands:

    sudo yum --disablerepo=* install /tmp/galera-4.rpm /tmp/socat.rpm
    sudo yum --disablerepo=* upgrade /tmp/MariaDB-server.rpm
  8. To remove incompatible backup packages, enter the following command:

    sudo yum remove percona-xtrabackup
  9. NOTE: If the "yum remove" command fails, it means that the package does not exist on the SL1 appliance. You can ignore the error message.

  10. To regenerate the configuration file for MariaDB, enter the following command:

    sudo /opt/em7/share/scripts/generate-my-conf.py -f -o /etc/my.cnf.d/silo_mysql.cnf
  11. To restore the mariadb.service enabled state, enter the following command:

    sudo systemctl ${MSRV::-1} mariadb.service
  12. To take the secondary Database Server out of maintenance mode, first assume root privileges. At the shell prompt, enter the following command:

    sudo -s
  13. When prompted, enter the administrator password.
  14. At the shell prompt, enter the following command:

    coro_config

    The following menu appears:

    1) Disable Maintenance
    2) Option Disabled
    3) Promote DRBD
    4) Stop Pacemaker
    5) Resource Status
    6) Quit
  15. Enter "1".

Manually Upgrade Three Database Servers Configured for High Availability and Disaster Recovery

To upgrade a High Availability/Disaster Recovery cluster, perform the following steps:

The system will be unavailable when performing these steps.

Step 1: On the Secondary Database Server

You must put the secondary Database Server in maintenance mode. To do this:

  1. Open an SSH session to the Database Server.
  2. At the shell prompt, assume root privileges:

    sudo -s
  3. When prompted, enter the administrator password.
  4. At the shell prompt, enter the following command:

    coro_config
  5. The following menu appears:

    1) Enable Maintenance
    2) Option Disabled
    3) Promote DRBD
    4) Stop Pacemaker
    5) Resource Status
    6) Quit
  6. Enter "1".

Step 2: On the Primary Database Server

  1. Determine the current installed version of the RPMs:

    sudo rpm -qa ^MariaDB-*
  2. Stop SL1 and MariaDB using the following commands:

    sudo systemctl stop em7
    sudo systemctl stop mariadb.service
  3. Stop the MySQL resource:

    sudo crm resource stop mysql
  4. Save the current enabled state for the mariadb.service:

    export MSRV=`sudo systemctl is-enabled mariadb.service`
  5. Check the version of MariaDB-server that you are running:

    rpm -q MariaDB-server
  6. You must follow the steps that correspond to your version of MariaDB.

  7. MariaDB-server version 10.1.x. If you are running MariaDB-server version 10.1.x:
  8. Do these steps in order. Doing the steps in any other order will result in unintended consequences.

    1. Remove MariaDB-server by using the following command:

      sudo rpm --nodeps -ev MariaDB-server
    2. Replace the Galera package and install the new MariaDB-server package by using the following commands:

      sudo yum --disablerepo=* swap -- remove galera -- install /tmp/galera-4.rpm
      sudo yum --disablerepo=* install /tmp/MariaDB-server.rpm
  9. MariaDB-server version 10.4.12 and higher. If you are running MariaDB-server version 10.4.12 or higher, upgrade the MariaDB-server package and dependent packages (galera-4 and socat) by using the following commands:

    sudo yum --disablerepo=* install /tmp/galera-4.rpm /tmp/socat.rpm
    sudo yum --disablerepo=* upgrade /tmp/MariaDB-server.rpm
  10. Remove incompatible backup packages:

    sudo yum remove percona-xtrabackup
  11. NOTE: If the "yum remove" command fails, it means that the package does not exist on the SL1 appliance. You can ignore the error message.

  12. Regenerate the configuration file for MariaDB:

    sudo /opt/em7/share/scripts/generate-my-conf.py -f -o /etc/my.cnf.d/silo_mysql.cnf
  13. Restart MariaDB:

    sudo systemctl daemon-reload
    sudo systemctl start mariadb
  14. Restart the MySQL resource:

    sudo crm resource start mysql
  15. Restore the mariadb.service enabled state:

    sudo systemctl ${MSRV::-1} mariadb.service
  16. Upgrade the internal configuration for the database:

    sudo mysql_upgrade -u root -p
  17. Restart the em7 service:

    sudo systemctl start em7

Step 3: On the Secondary Database Server

  1. Save the current enabled state for the mariadb.service:

    export MSRV=`sudo systemctl is-enabled mariadb.service`
  2. Check the version of MariaDB-server that you are running:

    rpm -q MariaDB-server
  3. You must follow the steps that correspond to your version of MariaDB.

  4. MariaDB-server version 10.1.x. If you are running MariaDB-server version 10.1.x:
  5. Do these steps in order. Doing the steps in any other order will result in unintended consequences.

    1. Remove MariaDB-server:

      sudo rpm --nodeps -ev MariaDB-server
    2. Replace the Galera package and install the new MariaDB-server package by using the following commands:

      sudo yum --disablerepo=* swap -- remove galera -- install /tmp/galera-4.rpm
      sudo yum --disablerepo=* install /tmp/MariaDB-server.rpm
  6. MariaDB-server version 10.4.12 and higher. If you are running MariaDB-server version 10.4.12 or higher, upgrade the MariaDB-server package and dependent packages (galera-4 and socat) by using the following commands:

    sudo yum --disablerepo=* install /tmp/galera-4.rpm /tmp/socat.rpm
    sudo yum --disablerepo=* upgrade /tmp/MariaDB-server.rpm
  7. Remove incompatible backup packages:

    sudo yum remove percona-xtrabackup
  8. NOTE: If the "yum remove" command fails, it means that the package does not exist on the SL1 appliance. You can ignore the error message.

  9. Regenerate the configuration file for MariaDB:

    sudo /opt/em7/share/scripts/generate-my-conf.py -f -o /etc/my.cnf.d/silo_mysql.cnf
  10. Restore the mariadb.service enabled state:

    sudo systemctl ${MSRV::-1} mariadb.service
  11. To take the secondary Database Server out of maintenance mode, assume root privileges:

    sudo -s
  12. When prompted, enter the administrator password.
  13. At the shell prompt, enter the following command:

    coro_config
  14. The following prompt appears:

    1) Disable Maintenance
    2) Option Disabled
    3) Promote DRBD
    4) Stop Pacemaker
    5) Resource Status
    6) Quit
  15. Enter "1".

Step 4: On the Disaster Recovery Database Server

  1. Open an SSH session to the Disaster Recovery Database Server.
  2. Assume root privileges:

    sudo -s
  3. When prompted, enter the administrator password.
  4. Save the current enabled state for the mariadb.service:

    export MSRV=`sudo systemctl is-enabled mariadb.service`
  5. Check the version of MariaDB-server that you are running:

    rpm -q MariaDB-server
  6. You must follow the steps that correspond to your version of MariaDB.

  7. MariaDB-server version 10.1.x. If you are running MariaDB-server version 10.1.x:
  8. Do these steps in order. Doing the steps in any other order will result in unintended consequences.

    1. Remove MariaDB-server:

      sudo rpm --nodeps -ev MariaDB-server
    2. Replace the Galera package and install the new MariaDB-server package by using the following commands:

      sudo yum --disablerepo=* swap -- remove galera -- install /tmp/galera-4.rpm
      sudo yum --disablerepo=* install /tmp/MariaDB-server.rpm
  9. MariaDB-server version 10.4.12 and higher. If you are running MariaDB-server version 10.4.12 or higher, upgrade the MariaDB-server package and dependent packages (galera-4 and socat) by using the following commands:

    sudo yum --disablerepo=* install /tmp/galera-4.rpm /tmp/socat.rpm
    sudo yum --disablerepo=* upgrade /tmp/MariaDB-server.rpm
  10. Remove incompatible backup packages:

    sudo yum remove percona-xtrabackup
  11. NOTE: If the "yum remove" command fails, it means that the package does not exist on the SL1 appliance. You can ignore the error message.

  12. Regenerate the configuration file for MariaDB:

    sudo /opt/em7/share/scripts/generate-my-conf.py -f -o /etc/my.cnf.d/silo_mysql.cnf
  13. Restore the mariadb.service enabled state:

    sudo systemctl ${MSRV::-1} mariadb.service

Manually Upgrading Standalone Database Servers, All-In-One Appliances, Data Collectors, and Message Collectors

To upgrade MariaDB on one or more Database Servers that are not configured for high availability or disaster recovery, a single All-In-One Appliance, one or more Data Collectors, or one or more Message Collectors, perform the following steps:

The Database Server, All-In-One Appliance, Data Collector, or Message Collector will be unavailable when performing these steps.

  1. Go to the console or open an SSH session to the SL1 appliance.
  2. Stop SL1 and MariaDB:

    sudo systemctl stop em7
    sudo systemctl stop mariadb.service
  3. Save the current enabled state for the mariadb.service:

    export MSRV=`sudo systemctl is-enabled mariadb.service`
  4. Check the version of MariaDB-server that you are running:

    rpm -q MariaDB-server
  5. You must follow the steps that correspond to your version of MariaDB.

  6. MariaDB-server version 10.1.x. If you are running MariaDB-server version 10.1.x:
  7. Do these steps in order. Doing the steps in any other order will result in unintended consequences.

    1. Remove MariaDB-server:

      sudo rpm --nodeps -ev MariaDB-server
    2. Replace the Galera package and install the new MariaDB-server package by using the following commands:

      sudo yum --disablerepo=* swap -- remove galera -- install /tmp/galera-4.rpm
      sudo yum --disablerepo=* install /tmp/MariaDB-server.rpm
  8. MariaDB-server version 10.4.12 and higher. If you are running MariaDB-server version 10.4.12 or higher, upgrade the MariaDB-server package and dependent packages (galera-4 and socat) by using the following commands:

    sudo yum --disablerepo=* install /tmp/galera-4.rpm /tmp/socat.rpm
    sudo yum --disablerepo=* upgrade /tmp/MariaDB-server.rpm
  9. Remove incompatible backup packages:

    sudo yum remove percona-xtrabackup
  10. NOTE: If the "yum remove" command fails, it means that the package does not exist on the SL1 appliance. You can ignore the error message.

  11. Regenerate the configuration file for MariaDB:

    sudo /opt/em7/share/scripts/generate-my-conf.py -f -o /etc/my.cnf.d/silo_mysql.cnf
  12. Restart MariaDB:

    sudo systemctl daemon-reload
    sudo systemctl start mariadb
  13. Restore the mariadb.service enabled state:

    sudo systemctl ${MSRV::-1} mariadb.service
  14. Upgrade the internal configuration for the database:

    sudo mysql_upgrade -u root -p
  15. Restart the em7 service:

    sudo systemctl start em7
  16. Repeat all the steps in this section on each non-HA/DR Database Server, All-In-One Appliance, Data Collector, and Message Collector.
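
The manual procedures branch on the installed MariaDB-server version: 10.1.x requires removing the package and then installing the new one, while 10.4.12 and higher can be upgraded in place. The following is a minimal sketch of that decision based on the output of rpm -q MariaDB-server. The function mariadb_upgrade_path and the labels it prints are hypothetical and not part of SL1; versions that this section does not cover are reported as unknown.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of SL1: map the output of
# "rpm -q MariaDB-server" to the upgrade path described in this section.
mariadb_upgrade_path() {
    ver=${1#MariaDB-server-}   # drop the package name
    ver=${ver%%-*}             # drop the release and architecture
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    case $rest in
        *.*) patch=${rest#*.}; patch=${patch%%.*} ;;
        *)   patch=0 ;;
    esac
    # Reject anything that is not a dotted numeric version, such as the
    # "is not installed" message from rpm.
    for part in "$major" "$minor" "$patch"; do
        case $part in
            ''|*[!0-9]*) echo "unknown"; return 1 ;;
        esac
    done
    if [ "$major" -eq 10 ] && [ "$minor" -eq 1 ]; then
        echo "remove-then-install"
    elif [ "$major" -gt 10 ]; then
        echo "upgrade-in-place"
    elif [ "$major" -eq 10 ] && [ "$minor" -gt 4 ]; then
        echo "upgrade-in-place"
    elif [ "$major" -eq 10 ] && [ "$minor" -eq 4 ] && [ "$patch" -ge 12 ]; then
        echo "upgrade-in-place"
    else
        echo "unknown"
        return 1
    fi
}

# Example: classify a sample package string.
mariadb_upgrade_path "MariaDB-server-10.4.12-1.el7.centos.x86_64"
```

On an appliance, the argument would be the output of rpm -q MariaDB-server; an unknown result is a signal to stop and check the release notes rather than to guess which steps apply.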

Additional Steps for MariaDB Upgrades in 10.1.x

SL1 10.1.x included an upgrade to MariaDB. The upgrade did not include a tool, jemalloc, that helps manage memory usage.

This section applies only to the following releases:

  • 10.1.0
  • 10.1.1
  • 10.1.2
  • 10.1.3
  • 10.1.4
  • 10.1.4.1
  • 10.1.4.2
  • 10.1.5
  • 10.1.5.1

For SL1 versions later than 10.1.5.1, jemalloc is included with the platform. For SL1 versions prior to 10.1.0, jemalloc is included with the platform.

To avoid problems with memory usage on Database Servers, perform the following steps after upgrading MariaDB for 10.1.x.

Perform these steps first on the active Database Server and then on each additional Database Server in your SL1 System.

  1. Open an SSH session to the Database Server.
  2. To verify that the Database Server is not currently running jemalloc, enter the following command at the shell prompt:

    silo_mysql -e 'show global variables like "version_malloc_library"'
  3. If the Database Server is not currently running jemalloc, the shell will display the following:

    Variable Name Value
    version_malloc_library system
  4. Search for the file /usr/lib64/libjemalloc.so.1.

    If the file does not exist, contact ScienceLogic Customer Support to request the file jemalloc-3.6.0-1.el7.x86_64.rpm.

    To install the RPM, use a file-transfer utility to copy the file to a directory on the SL1 appliance. Then enter the following commands at the shell prompt:

    cd /usr/lib64
    sudo yum install jemalloc-3.6.0-1.el7.x86_64.rpm
  5. Create the file /etc/systemd/system/mariadb.service.d/jemalloc.conf, as follows:

    vi /etc/systemd/system/mariadb.service.d/jemalloc.conf
  6. Add the following lines to the file:

    [Service]
    Environment="LD_PRELOAD=/usr/lib64/libjemalloc.so.1"
  7. Save and close the file.
  8. Reload the systemd config files with the following command:

    sudo systemctl daemon-reload
  9. Restart the Database Server:

    To restart the standalone Database Server or the primary Database Server in a cluster, enter the following:

    sudo systemctl restart mariadb

    To restart each secondary Database Server in a cluster:

    1. Open an SSH session to the secondary Database Server. At the shell prompt, enter:

      coro_config
    2. Select 1.
    3. When prompted to put the Database Server into maintenance, select y.
    4. Open an SSH session to the primary Database Server. To pause SL1, enter the following command at the shell prompt:

      sudo touch /etc/.proc_mgr_pause
    5. In the SSH session for the secondary Database Server, restart MariaDB:

      crm resource restart mysql
    6. After MariaDB has restarted successfully on the secondary Database Server, return to the SSH session on the primary Database Server. Remove the pause file for SL1 using the following command:

      sudo rm /etc/.proc_mgr_pause
    7. In the SSH session on the secondary Database Server, take the Database Server out of maintenance. At the shell prompt, enter:

      coro_config
    8. Select 1.
    9. When prompted to take the Database Server out of maintenance, select y.
  10. To verify that jemalloc is running on the Database Server, enter the following command at the shell prompt:

    silo_mysql -e 'show global variables like "version_malloc_library"'
  11. If the Database Server is currently running jemalloc, the shell will display something like the following:

    Variable Name Value
    version_malloc_library jemalloc 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340
  12. Perform these steps on each Database Server in your SL1 system.
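
The verification in steps 2 and 10 can be reduced to a single value. The following sketch is not part of SL1: the function name malloc_library is hypothetical, and it assumes that the query output contains a line that starts with version_malloc_library, as in the samples above.

```shell
# Read the query output on standard input and print the first word
# after "version_malloc_library" ("system" or "jemalloc" in the samples).
malloc_library() {
    awk '$1 == "version_malloc_library" { print $2; exit }'
}

# On a Database Server, the function could be fed from the query in step 2:
#   silo_mysql -e 'show global variables like "version_malloc_library"' | malloc_library

# Example using the sample output from step 3.
printf 'Variable Name Value\nversion_malloc_library system\n' | malloc_library
```

A result of system means MariaDB is still using the default allocator and the remaining steps apply.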

Rebooting Appliances in the SL1 Distributed Stack

Refer to the release notes for your current release to determine if you must reboot all SL1 appliances. Not every SL1 update requires rebooting.

When an upgrade requires a reboot, use the steps listed in this section to reboot all SL1 appliances in the Distributed stack.

Rebooting the Administration Portal

You can reboot Administration Portals either from the user interface or from the command line.

Rebooting Multiple Administration Portals

If your SL1 system includes multiple Administration Portals, you can remotely reboot Administration Portals from another Administration Portal. To do so:

  1. Go to the Appliance Manager page (System > Settings > Appliances).
  2. Select the checkboxes for the SL1 appliances you want to reboot.
  3. In the [Select Action] menu, select Reboot and click the Go button.
  4. Click the OK button when the "Are you sure you want to reboot the selected appliances?" message is displayed.
  5. During the reboot, the user interface for the affected Administration Portal is unavailable.
  6. When the reboot has completed, the Audit Logs page (System > Monitor > Audit Logs) will include an entry for each appliance that was rebooted.

Rebooting a Single Administration Portal

If your SL1 system includes only a single Administration Portal, perform the following steps to reboot that Administration Portal:

  1. Either go to the console of the Database Server or use SSH to access the Database Server.
  2. Log in as em7admin with the appropriate password.
  3. At the shell prompt, execute the following:

    python -m silo_common.admin_toolbox <appliance_ID> "/usr/bin/sudo /usr/sbin/shutdown -r +1"

    where:

    • appliance_ID is the appliance ID for the Data Collector, Message Collector, or Administration Portal.

Rebooting Data Collectors and Message Collectors

You can reboot Data Collectors and Message Collectors either from the user interface or from the command line.

Rebooting Data Collectors and Message Collectors from the Appliance Manager page

From the SL1 user interface, perform the following steps to reboot a Data Collector or a Message Collector:

  1. Go to the Appliance Manager page (System > Settings > Appliances).
  2. Select the checkbox for each SL1 appliance you want to reboot.
  3. In the [Select Action] menu, select Reboot and click the Go button.
  4. Click the OK button when the "Are you sure you want to reboot the selected appliances?" message is displayed.
  5. During the reboot, go to the System Logs page (System > Monitor > System Logs). You should see this message:

    Major: Could not connect to module (5) database USING SSL=TRUE: Error attempting to connect to database with SSL enabled True: (2003, 'Can't connect to MySQL server on '10.2.12.77' (113 "No route to host")')

  6. When the reboot has completed, the Audit Logs page (System > Monitor > Audit Logs) will include an entry for each appliance that was rebooted.

Rebooting Data Collectors and Message Collectors from the Command Line

From the console of the Database Server or an SSH session to the Database Server, perform the following steps to reboot a Data Collector or Message Collector:

  1. Either go to the console of the Database Server or use SSH to access the Database Server.
  2. Log in as em7admin with the appropriate password.
  3. At the shell prompt, execute the following:

    python -m silo_common.admin_toolbox <appliance_ID> "/usr/bin/sudo /usr/sbin/shutdown -r +1"

    where:

    • appliance_ID is the appliance ID for the Data Collector, Message Collector, or Administration Portal.
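
If you need to reboot several collectors, you can generate the command for each appliance ID and review the list before running anything. The following is a minimal sketch, not an SL1 utility: the function name print_reboot_commands is hypothetical, the IDs 3 and 7 are placeholders, and the check that an ID contains only digits is an assumption about how your appliance IDs look.

```shell
# Print (do not run) the reboot command from step 3 for each appliance ID.
print_reboot_commands() {
    for id in "$@"; do
        case $id in
            ''|*[!0-9]*)
                # Assumption: appliance IDs are whole numbers.
                echo "skipping invalid appliance ID: $id" >&2
                continue
                ;;
        esac
        printf 'python -m silo_common.admin_toolbox %s "/usr/bin/sudo /usr/sbin/shutdown -r +1"\n' "$id"
    done
}

# Example with placeholder appliance IDs.
print_reboot_commands 3 7
```

Because the function only prints text, you can copy each line to the shell prompt on the Database Server after confirming the IDs.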

Rebooting Standalone All-In-One Appliance and Standalone Database Server

Perform the following steps to reboot a standalone All-In-One Appliance or a standalone Database Server:

  1. Either go to the console or use SSH to access the SL1 appliance.
  2. Log in as em7admin with the appropriate password.
  3. On the SL1 appliance, pause the system and shut down MariaDB:

    sudo touch /tmp/.proc_mgr_pause
    sudo systemctl stop mariadb
  4. Reboot the SL1 appliance:

    sudo reboot
  5. After the SL1 appliance has rebooted, either go to the console or use SSH to access the SL1 appliance.
  6. Log in as em7admin with the appropriate password.
  7. Un-pause the SL1 appliance:

    sudo rm /tmp/.proc_mgr_pause

Rebooting Two Database Servers Configured for Disaster Recovery

Perform the following steps to reboot two Database Servers configured for Disaster Recovery:

  1. Either go to the console of the primary Database Server or use SSH to access the primary Database Server.
  2. Log in as em7admin with the appropriate password.
  3. Check the status of both Database Servers. To do this, enter the following at the shell prompt:

    cat /proc/drbd

    Your output will look like this:

    1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:17567744 al:0 bm:1072 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12521012
  4. Pause the system and shut down MariaDB on the primary Database Server. Enter the password for the em7admin user when prompted:

    sudo touch /tmp/.proc_mgr_pause
    sudo systemctl stop pacemaker
    sudo systemctl stop mariadb
  5. Reboot the primary Database Server:

    sudo reboot
  6. After the primary appliance has rebooted, log in to the console of the primary Database Server again.
  7. Execute the following command on the primary Database Server:

    sudo rm /tmp/.proc_mgr_pause
  8. Enter the password for the em7admin user and confirm the command when prompted.
  9. Log in to the secondary Database Server as the em7admin user using the console or SSH.
  10. Execute the following command on the secondary Database Server to reboot the appliance:

    sudo reboot
  11. Enter the password for the em7admin user when prompted.

Rebooting Two Database Servers in a High Availability Cluster

Perform the following steps to reboot two Database Servers in a high availability cluster:

  1. Either go to the console of the secondary Database Server or use SSH to access the secondary Database Server.
  2. Log in as em7admin with the appropriate password.
  3. Check the status of both Database Servers. To do this, enter the following at the shell prompt:

    cat /proc/drbd

    Your output will look like this:

    1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:17567744 al:0 bm:1072 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12521012

  4. If your output includes "ro:Secondary/Primary", but does not include "UpToDate/UpToDate", data is being synchronized between the two appliances. You must wait until data synchronization has finished before rebooting.

  5. Stop the cluster service on the secondary Database Server:

    sudo systemctl stop pacemaker
  6. Enter the password for the em7admin user when prompted.
  7. Either go to the console of the primary Database Server or use SSH to access the primary Database Server.
  8. Log in as em7admin with the appropriate password.
  9. Pause the system and stop the cluster service on the primary Database Server. Enter the password for the em7admin user when prompted:

    sudo touch /tmp/.proc_mgr_pause
    sudo systemctl stop pacemaker
    sudo systemctl stop mariadb
  10. Reboot the primary Database Server:

    sudo reboot
  11. After the primary Database Server has rebooted, either go to the console of the primary Database Server or use SSH to access the primary Database Server.
  12. Log in as em7admin with the appropriate password.
  13. Execute the following command on the primary Database Server:

    sudo rm /tmp/.proc_mgr_pause
  14. Enter the password for the em7admin user and confirm the command when prompted.
  15. Either go to the console of the secondary Database Server or use SSH to access the secondary Database Server.
  16. Log in as em7admin with the appropriate password.
  17. Reboot the secondary Database Server:

    sudo reboot
  18. Enter the password for the em7admin user when prompted.
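
The status check in step 3 can be scripted so that you do not have to read the DRBD output by eye. The following is a minimal sketch, not part of SL1: the function name drbd_is_synced is hypothetical, and it looks only for the ds:UpToDate/UpToDate field shown in the sample output above.

```shell
# Read DRBD status text on standard input. Succeed (exit status 0) only if
# at least one resource line is present and every resource line reports
# ds:UpToDate/UpToDate.
drbd_is_synced() {
    awk '
        / ds:/ {
            seen = 1
            if ($0 !~ /ds:UpToDate\/UpToDate/) bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}

# On a Database Server, the function could be fed from the file in step 3:
#   drbd_is_synced < /proc/drbd && echo "Data is synchronized."

# Example using the sample output from step 3.
if printf '%s\n' '1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----' | drbd_is_synced; then
    echo "Data is synchronized."
else
    echo "Wait for data synchronization to finish."
fi
```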

Rebooting Three Database Servers Configured for High Availability and Disaster Recovery

Perform the following steps to reboot three Database Servers configured for high availability and disaster recovery. In this configuration, two Database Servers are configured as a High Availability cluster and one Database Server is configured for Disaster Recovery.

  1. Either go to the console of the secondary Database Server in the HA cluster or use SSH to access the secondary Database Server in the HA cluster.
  2. Log in as em7admin with the appropriate password.
  3. Check the status of both Database Servers in the HA cluster. To do this, enter the following at the shell prompt:

    cat /proc/drbd

    Your output will look like this:

    10: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
    ns:17567744 al:0 bm:1072 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12521012

  4. If your output includes "ro:Secondary/Primary", but does not include "UpToDate/UpToDate", data is being synchronized between the two appliances. You must wait until data synchronization has finished before rebooting.

  5. Stop the cluster service with the following command on the secondary Database Server in the HA cluster:

    sudo systemctl stop pacemaker
  6. Enter the password for the em7admin user when prompted.
  7. Either go to the console of the primary Database Server in the HA cluster or use SSH to access the primary Database Server in the HA cluster.
  8. Log in as em7admin with the appropriate password.
  9. Pause the system and stop the cluster service on the primary Database Server in the HA cluster:

    sudo touch /tmp/.proc_mgr_pause
    sudo systemctl stop pacemaker
    sudo systemctl stop mariadb
  10. Enter the password for the em7admin user when prompted.
  11. Reboot the primary Database Server in the HA cluster:

    sudo reboot
  12. After the primary Database Server in the HA cluster has rebooted, either go to the console of the primary Database Server in the HA cluster or use SSH to access the primary Database Server in the HA cluster.
  13. Execute the following command on the primary Database Server in the HA cluster:

    sudo rm /tmp/.proc_mgr_pause
  14. Enter the password for the em7admin user and confirm the command when prompted.
  15. Either go to the console of the secondary Database Server in the HA cluster or use SSH to access the secondary Database Server in the HA cluster.
  16. Log in as em7admin with the appropriate password.
  17. Reboot the secondary Database Server in the HA cluster:

    sudo reboot
  18. Enter the password for the em7admin user when prompted.
  19. Either go to the console of the Database Server for Disaster Recovery or use SSH to access the Database Server for Disaster Recovery.
  20. Log in as em7admin with the appropriate password.
  21. Reboot the Database Server for Disaster Recovery:

    sudo reboot
  22. Enter the password for the em7admin user when prompted.

Restoring Custom Settings in the NextUI

To restore the backup of the custom settings in the NextUI:

  1. Log in to the console of the Database Server or use SSH to access the Database Server.
  2. Open a shell session.
  3. Enter the following at the shell prompt:

    cp /opt/em7/nextui/nextui.env.backup /opt/em7/nextui/nextui.env

Restoring the SSL Certificates

To restore your SSL Certificates:

  1. Log in to the console of the Database Server or use SSH to access the Database Server.
  2. Open a shell session.
  3. Enter the following at the shell prompt:

    cp /etc/nginx/silossl.key.bak /etc/nginx/silossl.key
    cp /etc/nginx/silossl.pem.bak /etc/nginx/silossl.pem
  4. Repeat these steps on each Database Server in your SL1 system.
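
If you are not sure that a backup file exists, you can check for it before copying, so that a mistyped path does not go unnoticed. The following is a minimal sketch under that assumption: the function name restore_from_backup is hypothetical, and it wraps the same cp command used in step 3.

```shell
# Copy a backup file over its target, but only if the backup exists.
restore_from_backup() {
    backup=$1
    target=$2
    if [ ! -f "$backup" ]; then
        echo "backup not found: $backup" >&2
        return 1
    fi
    cp "$backup" "$target"
}

# On a Database Server, the calls would mirror step 3:
#   restore_from_backup /etc/nginx/silossl.key.bak /etc/nginx/silossl.key
#   restore_from_backup /etc/nginx/silossl.pem.bak /etc/nginx/silossl.pem
```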

Resetting the Timeout for PhoneHome Watchdog

You can manually reset the settings for the PhoneHome Watchdog service back to the settings you used before the upgrade.

To edit the settings for the watchdog service:

  1. Log in to the console of the Data Collector as the root user or open an SSH session on the Data Collector.
  2. View your PhoneHome Watchdog settings:

    phonehome watchdog view

    Your output will look like the following:

    Current settings:
    autosync: yes
    interval: 120
    state: enabled
    autoreconnect: yes
    timeoutcount: 1
    check: default

    Compare the settings for interval and timeoutcount with the values you used before the upgrade.

  3. To restore the settings you used before the SL1 upgrade, type the following at the command line:

    sudo phonehome watchdog set interval=<previous setting>;
    sudo phonehome watchdog set timeoutcount=<previous setting>;
    systemctl stop em7_ph_watchdog;
    systemctl start em7_ph_watchdog;
  4. Repeat these steps on each Data Collector.
  5. Repeat these steps on each Message Collector.
  6. Repeat these steps on each Database Server.
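
To read a single value from the output in step 2, you can filter it by name. The following is a minimal sketch, not a PhoneHome feature: the function name watchdog_setting is hypothetical, and it assumes that each setting appears on its own line as "name: value", as in the sample output above.

```shell
# Read "phonehome watchdog view" output on standard input and print the
# value of the setting named in the first argument.
watchdog_setting() {
    awk -v key="$1" '
        { sub(/^[ \t]+/, "") }          # ignore any leading indentation
        index($0, key ":") == 1 {       # line starts with "<key>:"
            sub(/^[^:]*:[ \t]*/, "")    # drop the name and the separator
            print
            exit
        }
    '
}

# On an appliance, the function could be fed from the command in step 2:
#   phonehome watchdog view | watchdog_setting interval

# Example using the sample output from step 2.
printf 'Current settings:\nautosync: yes\ninterval: 120\ntimeoutcount: 1\n' | watchdog_setting interval
```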

Updating Default PowerPacks

Every time you install a software update on your appliances, ScienceLogic recommends that you also install the updates for all the PowerPacks that were included in the software update.

ScienceLogic includes multiple PowerPacks in the default installation of SL1. When you apply an update to your system, new versions of the default PowerPacks will be automatically imported into your system. If a PowerPack is included in an update and is not currently installed on your system, SL1 will automatically install the PowerPack. If a PowerPack is included in an update and is currently installed on your system, SL1 will automatically import (but not install) the PowerPack.

If PowerPacks have been imported into your system but have not been installed, the Update column appears in the PowerPack Manager page (System > Manage > PowerPacks). For each PowerPack that has been imported to your system but has not been installed, the lightning bolt icon appears in the Update field on the PowerPack Manager page.

To install the updates for multiple PowerPacks:

  1. Go to the PowerPack Manager page (System > Manage > PowerPacks) and click the checkbox for each PowerPack you want to install.
  2. In the Select Action drop-down field (in the lower right), choose Update PowerPack(s). SL1 displays a warning message before updating the PowerPack(s).
  3. Click the OK button to continue the installation.
  4. Click the Go button. When the update completes, updated information about the PowerPack appears in the PowerPack Manager page, and all the items in the PowerPack are installed in your SL1 system.

NOTE: You can install multiple PowerPacks with the Select Action drop-down list only if each selected PowerPack includes an embedded Installation Key. PowerPacks that do not include embedded Installation Keys will fail to install.

NOTE: If the Enable Selective PowerPack Field Protection checkbox on the Behavior Settings page (System > Settings > Behavior) is selected, certain fields in Event Policies, Dynamic Applications, and Device Classes will not be updated.

Configuring Subscription Billing

If your SL1 system is configured to communicate with the ScienceLogic billing server, usage data will be sent automatically from your SL1 system to the ScienceLogic billing server once a day. After the ScienceLogic billing server receives the usage data, SL1 will automatically mark the license usage file as delivered.

Sending usage data to the ScienceLogic billing server ensures that your bill is accurate and that ScienceLogic can continue making improvements to the SL1 products.

To determine if you have correctly configured Subscription Billing:

  • Go to the System Usage page (System > Monitor > System Usage) or (Manage > Subscription Usage). Click the Subscription button and choose License Data Delivery Status.
  • For air-gapped SL1 systems, the value of Summary Date should be within the past 48 hours.
  • For SL1 systems that connect to ScienceLogic, the value of Summary Date should be within the past 48 hours and the value of Delivery Status should be 1.

For details on configuring subscription billing, see the section on configuring subscription billing.