Troubleshooting


This section contains troubleshooting resources and procedures to use with the Google Cloud Platform SyncPack.

Initial Troubleshooting Steps

When troubleshooting, first confirm that there are no issues with the data platforms that PowerFlow interacts with. Additional configurations or actions enabled on Google Cloud or SL1 might cause unexpected behavior. Perform the steps in the following sections to troubleshoot SL1 PowerFlow and Google Cloud. For detailed information about how to perform the steps below, see Resources for Troubleshooting.

SL1 PowerFlow

  1. Run docker service ls on the PowerFlow server.
  2. Note the Docker container version, and verify that the Docker services are running.
  3. If a certain service is failing, make a note of the service name and version.
  4. Run docker service ps <service_name> to see the historical state of the failing service and make a note of this information. For example: docker service ps iservices_contentapi
  5. Make a note of any logs impacting the service by running docker service logs <service_name>. For example: docker service logs iservices_couchbase
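
The steps above can be gathered into a single report to attach to a support case. The following is a minimal sketch, not a PowerFlow tool: the function name collect_service_diagnostics and the 200-line log limit are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print the output of steps 1-5 for one service as a single report.
# collect_service_diagnostics is a hypothetical helper, not part of PowerFlow.
collect_service_diagnostics() {
  local service="$1"    # for example: iservices_contentapi
  if [ -z "$service" ]; then
    echo "usage: collect_service_diagnostics <service_name>" >&2
    return 1
  fi
  echo "=== docker service ls ==="
  docker service ls
  echo "=== docker service ps ${service} ==="
  docker service ps "${service}"
  echo "=== docker service logs ${service} (last 200 lines) ==="
  # The 200-line limit is an arbitrary choice to keep the report short.
  docker service logs --tail 200 "${service}" 2>&1
}
```

For example, collect_service_diagnostics iservices_contentapi > contentapi_report.txt writes the report to a file on the PowerFlow server.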

Google Cloud Platform

  1. Make a note of the SyncPack version, if applicable.
  2. Make a note of the Google Cloud application that is failing on PowerFlow.
  3. Make a note of what step is failing in the application. Try running the application in debug mode, and capture any traceback or error messages that occur in the step log.

Resources for Troubleshooting

This section contains port information for PowerFlow and troubleshooting commands for Docker, Couchbase, and the PowerFlow API.

Useful PowerFlow Ports

  • https://<IP of PowerFlow>:8091. Provides access to Couchbase, a NoSQL database for storage and data retrieval.
  • https://<IP of PowerFlow>:15672. Provides access to the RabbitMQ Dashboard, which you can use to monitor the service that distributes tasks to be executed by PowerFlow workers. Use the following credentials to log in:
      • Username: guest
      • Password: guest
  • https://<IP of PowerFlow>/flower. Provides access to Flower, a tool for monitoring and administrating Celery clusters.

For version 2.0.0 and later of PowerFlow, port 5556 must be open for both PowerFlow and the client.

Helpful Docker Commands

PowerFlow is a set of services that are containerized using Docker. For more information about Docker, see the Docker tutorial.

Use the following Docker commands for troubleshooting and diagnosing issues with PowerFlow:

Viewing Container Versions and Status

To view the PowerFlow version, SSH in to your PowerFlow instance and run the following command:

docker service ls

In the results, you can see the container ID, name, mode, status (see the replicas column), and version (see the image column) for all the services that make up PowerFlow.
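
To spot services that are not fully running, you can compare the two numbers in the replicas column (running/desired). The following sketch assumes the input is one "name replicas" pair per line, which the --format option of docker service ls can produce. The function name find_degraded_services is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list services whose running replica count differs from the desired count.
# Reads lines such as "iservices_contentapi 1/1" on stdin and prints the names
# of services where the two numbers differ.
find_degraded_services() {
  awk '{ split($2, r, "/"); if (r[1] + 0 != r[2] + 0) print $1 }'
}

# Typical use on the PowerFlow server:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | find_degraded_services
```

An empty result means every service reports as many running replicas as requested.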

Restarting a Service

Run the following command to restart a single service:

docker service update --force <service_name>

Stopping all PowerFlow Services

Run the following command to stop all PowerFlow services:

docker stack rm iservices

Restarting Docker

Run the following command to restart Docker:

systemctl restart docker

Restarting Docker does not clear the queue.

Viewing Logs for a Specific Service

You can use the Docker command line to view the logs of any currently-running service in the PowerFlow cluster. To view the logs of a service, run the following command:

docker service logs -f iservices_<service_name>

Some common examples include the following:

docker service logs -f iservices_couchbase

docker service logs -f iservices_steprunner

docker service logs -f iservices_contentapi

Application logs are stored on the central database as well as on all Docker hosts in a clustered environment. These logs are stored at /var/log/iservices in both single-node and clustered environments. However, the logs on each Docker host relate only to the services running on that host. For this reason, using the Docker service logs is the best way to get logs from all hosts at once.
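
Service logs can be long, so it often helps to narrow them to lines that look like failures. The following is a small sketch: the function name filter_error_lines and the search terms are assumptions, and you might need different terms for your own errors.

```shell
#!/usr/bin/env bash
# Sketch: keep only log lines that mention an error, exception, or traceback.
# The match is case-insensitive. "|| true" keeps the exit status at zero
# when no lines match, so the helper is safe to use in scripts.
filter_error_lines() {
  grep -iE 'error|exception|traceback' || true
}

# Typical use on the PowerFlow server (last 30 minutes of one service):
#   docker service logs --since 30m iservices_steprunner 2>&1 | filter_error_lines
```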

Clearing RabbitMQ Volume

RabbitMQ is a service that distributes tasks to be executed by PowerFlow workers.

The following error message might appear if you try to run a PowerFlow application via the API:

Internal error occurred: Traceback (most recent call last):\n File \"./content_api.py\", line 199, in kickoff_application\n task_status = ... line 623, in _on_close\n (class_id, method_id), ConnectionError)\nInternalError: Connection.open: (541) INTERNAL_ERROR - access to vhost '/' refused for user 'guest': vhost '/' is down

First, verify that your services are up. If there is an issue with your RabbitMQ volume, you can clear the volume with the following commands:

docker service rm iservices_rabbitmq

docker volume rm iservices_rabbitdb

If you get a message stating that the volume is in use, run the following command:

docker rm <id of container using volume>

Re-deploy PowerFlow by running the following command:

docker stack deploy -c /opt/iservices/scripts/docker-compose.yml iservices

Restarting Docker does not clear the queue, because the queue is persistent. However, clearing the queue with the commands above might result in data loss due to the tasks being removed from the queue.
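
The sequence above can be wrapped in one function so the steps always run in the same order. This is a sketch under stated assumptions, not a supported script: the function name reset_rabbitmq_volume is hypothetical, and the lookup of containers with docker ps --filter volume=... is an addition that stands in for finding the container ID by hand. Queued tasks are lost when the volume is removed.

```shell
#!/usr/bin/env bash
# Sketch: clear the RabbitMQ volume and re-deploy the PowerFlow stack.
# WARNING: tasks still in the queue are lost when the volume is removed.
reset_rabbitmq_volume() {
  local compose_file="${1:-/opt/iservices/scripts/docker-compose.yml}"
  local ids

  docker service rm iservices_rabbitmq || return 1

  if ! docker volume rm iservices_rabbitdb; then
    # The volume is still in use: find the containers that hold it.
    ids="$(docker ps -aq --filter volume=iservices_rabbitdb)"
    if [ -n "$ids" ]; then
      # Unquoted on purpose so each container ID is passed separately.
      # This fails if a container is still shutting down; wait and retry.
      docker rm $ids || return 1
    fi
    docker volume rm iservices_rabbitdb || return 1
  fi

  docker stack deploy -c "$compose_file" iservices
}
```

Calling reset_rabbitmq_volume with no arguments uses the compose file path shown in the re-deploy command above.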

Viewing the Process Status of All Services

Run the following command:

docker ps

Deploying Services from a Defined Docker Compose File

Run the following command:

docker stack deploy -c <compose-file> iservices

Dynamically Scaling for More Workers

Run the following command:

docker service scale iservices_steprunner=10

Completely Removing All Running Services

Run the following command:

docker stack rm iservices

Diagnostic Tools

Multiple diagnostic tools exist to assist in troubleshooting issues with the PowerFlow platform:

  • Docker PowerPack. This PowerPack monitors your Linux-based PowerFlow server with SSH (the PowerFlow ISO is built on top of an Oracle Linux Operating System). This PowerPack provides key performance indicators about how your PowerFlow server is performing. For more information on the Docker PowerPack and other PowerPacks that you can use to monitor PowerFlow, see the Using SL1 to Monitor SL1 PowerFlow section.
  • Flower. This web interface tool can be found at the /flower endpoint. It provides a dashboard displaying the number of tasks in various states as well as an overview of the state of each worker. This tool shows the current number of active, processed, failed, succeeded, and retried tasks on the PowerFlow platform. This tool also shows detailed information about each of the tasks that have been executed on the platform. This data includes the UUID, the state, the arguments that were passed to it, as well as the worker and the time of execution. Flower also provides a performance chart that shows the number of tasks running on each individual worker.
  • Debug Mode. All applications can be run in "debug" mode via the PowerFlow API. Running applications in debug mode might slow down the platform, but it produces much more detailed logging information that is helpful for troubleshooting issues. For more information on running applications in Debug Mode, see Retrieving Additional Debug Information.
  • Application Logs. All applications generate a log file specific to that application. These log files can be found at /var/log/iservices and each log file will match the ID of the application. These log files combine all the log messages of all previous runs of an application up to a certain point. These log files roll over and will get auto-cleared after a certain point.
  • Step Logs. Step logs display the log output for a specific step in the application. These step logs can be accessed via the PowerFlow user interface by clicking on a step in an application and bringing up the Step Log tab. These step logs display just the log output for the latest run of that step.
  • Service Logs. Each Docker service has its own log. These can be accessed via SSH by running the following command:

docker service logs -f <service_name>

Retrieving Additional Debug Information (Debug Mode)

The logs in PowerFlow use the following loglevel settings, from most verbose to least verbose:

  • 10. Debug Mode.
  • 20. Informational.
  • 30. Warning. This is the default setting if you do not specify a loglevel.
  • 40. Error.

If you run applications with "loglevel": 10, those applications will take longer to run because of increased I/O requirements. Enabling debug logging using the following process is the only recommended method. ScienceLogic does not recommend setting "loglevel": 10 for the whole stack with the docker-compose file.

To run an application in Debug Mode using the PowerFlow user interface:

  1. Select the PowerFlow application from the Applications page.
  2. Hover over the Run button and select Debug Run from the pop-up menu. PowerFlow executes the application in Debug Mode with a log level of 10.

To run an application in Debug Mode using the API:

  1. POST the following to the API endpoint:

https://<PowerFlow>/api/v1/applications/run

  2. Include the following in the request body:

{
  "name": "<application_name>",
  "params": {
    "loglevel": 10
  }
}

After running the application in Debug Mode, review the step logs in the PowerFlow user interface to see detailed debug output for each step in the application. This information is especially helpful when trying to understand why an application or step failed.

You can also run an application in Debug Mode using curl via SSH:

  1. SSH in to the PowerFlow instance.

  2. Run the following command:

curl -v -k -u isadmin:em7admin -X POST "https://<your_hostname>/api/v1/applications/run" -H 'Content-Type: application/json' -H 'cache-control: no-cache' -d '{"name": "interface_sync_sciencelogic_to_servicenow","params": {"loglevel": 10}}'
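
If you run applications in Debug Mode often, the request can be wrapped in a small helper. The following is a sketch only: the function names and the PF_USER and PF_PASS environment variables are assumptions, used here so that credentials are not typed into the command line. The endpoint and request body follow the curl command above.

```shell
#!/usr/bin/env bash
# Sketch: build the request body and POST it to the run endpoint.
# build_run_payload and run_application_debug are hypothetical helper names.

# Print the JSON body for an application name and loglevel (default 10).
# The name is inserted as-is, so it must not contain double quotes.
build_run_payload() {
  local name="$1" level="${2:-10}"
  case "$level" in
    10|20|30|40) ;;
    *) echo "loglevel must be 10, 20, 30, or 40" >&2; return 1 ;;
  esac
  printf '{"name": "%s", "params": {"loglevel": %s}}\n' "$name" "$level"
}

# Run an application with loglevel 10. PF_USER and PF_PASS are assumed to be
# set in the environment. -k skips certificate checks, as in the command above.
run_application_debug() {
  local host="$1" name="$2" payload
  payload="$(build_run_payload "$name" 10)" || return 1
  curl -k -u "${PF_USER}:${PF_PASS}" -X POST \
    "https://${host}/api/v1/applications/run" \
    -H 'Content-Type: application/json' \
    -d "$payload"
}
```

For example, run_application_debug <your_hostname> interface_sync_sciencelogic_to_servicenow sends the same request as the curl command above.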