Escalation Processes and Example Escalation Policy


This section describes sample escalation processes for acknowledging and clearing events, and includes an example of an automation policy that notifies staff if an event has not been acknowledged.

Typically, event escalation includes at least these three escalation processes:

  • Acknowledgment. When an event has been acknowledged, the acknowledging user's name appears in the Acknowledged column for the event on the Event Console page. This lets other users know that someone is investigating or taking action on the event. After acknowledgment, the acknowledging user can suppress the event. Suppressing an event specifies that, if the event occurs again on the same device, it will not appear in the Event Console, so the acknowledgment process does not have to be repeated.
  • Incident Response. After an event has been acknowledged (and optionally suppressed), you can then use ScienceLogic Ticketing or another incident response tool to monitor and document the actions required to resolve the event. For more information about managing incident response in SL1, see the section on Incident Management.
  • Resolution. When an event has been resolved, the resolving user can un-suppress the event and then clear it from the Event Console. Clearing an event removes a single instance of the event from the system; if the event occurs again on the same device, it reappears in the Event Console. Resolving the underlying problem is what keeps the event from occurring again on the same device.

Use the following menu options to navigate the SL1 user interface:

  • To view a pop-out list of menu options, click the menu icon.
  • To view a page containing all of the menu options, click the Advanced menu icon.

Sample Escalation Process for Acknowledging Events

The following is a sample escalation process for acknowledging critical events:

Image of a sample escalation process for acknowledging Critical Events

 

  • Escalation #1. Operations. Events are initially handled by the Operations unit. If the Operations staff does not acknowledge a critical event within 10 minutes, the event escalates to the Director of Operations.
  • Escalation #2. Director of Operations. If the Director of Operations does not acknowledge a critical event within 10 minutes, the event escalates to a Customer Satisfaction Representative.
  • Escalation #3. Customer Satisfaction Representative. If the Customer Satisfaction Representative does not acknowledge a critical event within 10 minutes, the event escalates to the Director of Customer Service.
  • Escalation #4. Director of Customer Service. If the Director of Customer Service does not acknowledge a critical event within 15 minutes, the event escalates to a Tier-3 Support Engineer.
  • Escalation #5. Tier-3 Support Engineer. If the Tier-3 Support Engineer does not acknowledge a critical event within 15 minutes, the event escalates to the Chief Engineer.
  • Escalation #6. Chief Engineer. If the Chief Engineer does not acknowledge a critical event within 30 minutes, the event escalates to the Director of Implementation.
  • Escalation #7. Director of Implementation. If the Director of Implementation does not acknowledge a critical event within 30 minutes, the event escalates to the Vice President of Service Delivery.
  • Escalation #8. Vice President of Service Delivery. This is the final escalation point.
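The per-step time limits above add up to a schedule measured from the moment the event occurs. The following is a minimal sketch of that arithmetic, assuming each time limit starts when the previous escalation happens. The names ACK_CHAIN and escalation_schedule are illustrative only and are not part of SL1.

```python
from itertools import accumulate

# (role, minutes allowed to acknowledge before escalating), copied from the list above.
# The final escalation point has no time limit.
ACK_CHAIN = [
    ("Operations", 10),
    ("Director of Operations", 10),
    ("Customer Satisfaction Representative", 10),
    ("Director of Customer Service", 15),
    ("Tier-3 Support Engineer", 15),
    ("Chief Engineer", 30),
    ("Director of Implementation", 30),
    ("Vice President of Service Delivery", None),
]


def escalation_schedule(chain):
    """Return (minutes_since_event, role_notified) for each escalation."""
    limits = [limit for _, limit in chain[:-1]]   # every role except the last has a limit
    roles = [role for role, _ in chain[1:]]       # every role except the first receives an escalation
    return list(zip(accumulate(limits), roles))


for minutes, role in escalation_schedule(ACK_CHAIN):
    print(f"{minutes:>3} min: escalate to {role}")
```

Under that assumption, the cumulative times are 10, 20, 30, 45, 60, 90, and 120 minutes after the event occurs.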

For major and minor events, the escalation process is similar, except that the time limit for each escalation is longer than it is for critical events.

Sample Escalation Process for Clearing Events

The following is a sample escalation process for clearing critical events:

Image of a sample escalation process for clearing Critical Events

 

  • Escalation #1. Operations. Events are initially handled by the Operations unit. If the Operations staff does not resolve a critical event within 10 minutes, the event escalates to the Director of Operations.
  • Escalation #2. Director of Operations. If the Director of Operations does not resolve a critical event within 10 minutes, the event escalates to a Customer Satisfaction Representative.
  • Escalation #3. Customer Satisfaction Representative. If the Customer Satisfaction Representative does not resolve a critical event within 10 minutes, the event escalates to the Director of Customer Service.
  • Escalation #4. Director of Customer Service. If the Director of Customer Service does not resolve a critical event within 15 minutes, the event escalates to a Tier-3 Support Engineer.
  • Escalation #5. Tier-3 Support Engineer. If the Tier-3 Support Engineer does not resolve a critical event within 15 minutes, the event escalates to the Chief Engineer.
  • Escalation #6. Chief Engineer. If the Chief Engineer does not resolve a critical event within 30 minutes, the event escalates to the Director of Implementation.
  • Escalation #7. Director of Implementation. If the Director of Implementation does not resolve a critical event within 30 minutes, the event escalates to the Vice President of Service Delivery.
  • Escalation #8. Vice President of Service Delivery. This is the final escalation point.
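To see who is responsible for an unresolved critical event at a given moment, the list above can be treated as a lookup by elapsed time. This is a rough sketch, assuming each time limit starts when the previous role is notified and that an event escalates at the exact minute its limit expires. THRESHOLDS, ROLES, and current_owner are hypothetical names, not SL1 features.

```python
from bisect import bisect_right

# Cumulative minutes at which an unresolved critical event moves to the next role,
# derived from the per-step limits in the list above (10, 10, 10, 15, 15, 30, 30).
THRESHOLDS = [10, 20, 30, 45, 60, 90, 120]

ROLES = [
    "Operations",
    "Director of Operations",
    "Customer Satisfaction Representative",
    "Director of Customer Service",
    "Tier-3 Support Engineer",
    "Chief Engineer",
    "Director of Implementation",
    "Vice President of Service Delivery",
]


def current_owner(minutes_unresolved):
    """Return the role responsible for an event unresolved for the given number of minutes."""
    # bisect_right counts how many thresholds have already been reached.
    return ROLES[bisect_right(THRESHOLDS, minutes_unresolved)]


print(current_owner(5))
print(current_owner(50))
```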

For major and minor events, the escalation process is similar, except that the time limit for each escalation is longer than it is for critical events.

Defining Escalation Policies

SL1 includes the Automation Policy Editor and the Action Policy Editor, which allow you to define escalation policies based upon event severity, elapsed time, and event status (for example, event acknowledged, ticket assigned, event cleared). When specified conditions are met, SL1 automatically performs one or more actions. The action in this example notifies specified team members through email.

For details on defining automation, see the section on Run Book Automation.

Example Escalation Policy for Event Acknowledgment

This section shows how to use the Automation Policy Editor and Action Policy Editor to create an escalation policy for event acknowledgment.

Creating the Action Policy

Based on the escalation process in the section Sample Escalation Process for Acknowledging Events, first create an action policy that sends an email message to the Director of Operations.

To create this action policy:

  1. Go to the Action Policy Manager page (Registry > Run Book > Actions).
  2. From the Action Policy Manager page, click the Create button. The Action Policy Editor page appears.
  3. In the Action Policy Editor page, supply values in the following fields:
  • Action Name. Type "event_escalation_Dir_of_Ops".
  • Description. Type "Email to Director of Operations".
  • Action Type. Select Send an Email Notification.
  • Email Subject. At the beginning of the field, type "Not Acknowledged: " and leave the other values in the field. The entire field should read "Not Acknowledged: %S Events: %M".
  • Available Emails. We selected the email address for our example Director of Operations, em7admin: mjtest@sciencelogic.com. If you want to see the emails that result from this action policy, you can select your own email address in this field. After selecting an email address, click the >> button to add it to the Assigned Emails field.
  • For all other fields, accept the default values.
  4. Click the Save button to save the new action policy.
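To illustrate how the Email Subject value from step 3 might look once the variables are filled in, here is a small sketch. It assumes that %S stands for the event severity and %M for the event message; this section does not define those variables, so treat that reading as an assumption. The render_subject function and the sample message are hypothetical, and SL1 performs its own substitution when it sends the email.

```python
# Subject line from the Email Subject field in the steps above.
SUBJECT_TEMPLATE = "Not Acknowledged: %S Events: %M"


def render_subject(template, severity, message):
    """Fill in the two variables (hypothetical helper, not an SL1 function)."""
    # Assumption: %S is the event severity and %M is the event message.
    return template.replace("%S", severity).replace("%M", message)


# Sample values are made up for illustration.
print(render_subject(SUBJECT_TEMPLATE, "Critical", "Device not responding to ping"))
```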

To create additional action policies for the remaining steps in the section Sample Escalation Process for Acknowledging Events, perform the steps above, but supply the following values:

Action Name | Available Emails
event_escalation_CS_rep | Select the appropriate email address for a Customer Satisfaction Representative. If you want to see the emails that result from this action policy, you can select your own email address in this field.
event_escalation_Dir_of_CS | Select the appropriate email address for the Director of Customer Service.
event_escalation_tier3 | Select the appropriate email address for a Tier-3 Support Engineer.
event_escalation_chief_eng | Select the appropriate email address for the Chief Engineer.
event_escalation_Dir_of_Impl | Select the appropriate email address for the Director of Implementation.
event_escalation_VP_of_Service | Select the appropriate email address for the Vice President of Service Delivery.

Creating the Automation Policy

Based on the escalation process in the section Sample Escalation Process for Acknowledging Events, you can create an automation policy that sends an email to the Director of Operations when a critical event has not been acknowledged for 10 minutes.

To create this automation policy:

  1. Go to the Automation Policy Manager page (Registry > Run Book > Automation).
  2. Click the Create button. The Automation Policy Editor page appears.
  3. Supply values in the following fields:
  • Policy Name. Type "event_not_acknowleged_10_minutes".
  • Organization. Select System. This automation policy will act on all events in your SL1 system.
  • Criteria Logic. These fields specify the conditions that must be met before the system executes the action specified in the automation policy. All conditions must be met for at least one of the selected events on at least one of the selected devices.
  • Severity Operator. Select Severity =.
  • Severity. Select Critical.
  • Elapsed time. The length of time that must elapse after the event occurs but before the system evaluates the other criteria in the automation policy. Select "and 10 minutes has elapsed".
  • Status. Event must have the specified status. Select "and event is NOT acknowledged".
  • Available Actions. Select the action policy you defined in the Creating the Action Policy section, Send Email: event_escalation_Dir_of_Ops. Click the >> button. The selected action policy will appear in the Aligned Actions field.
  • For all other fields, accept the default values.
  4. Click the Save button to save the new automation policy. Now, when an event with a severity of Critical occurs on any device and is not acknowledged within 10 minutes, the system sends an email to the Director of Operations.
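The three criteria in this policy (severity, elapsed time, and status) must all hold before the action runs. The following sketch expresses that combined condition under stated assumptions. The Event class and should_escalate function are hypothetical stand-ins and do not represent how SL1 evaluates policies internally.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """Hypothetical stand-in for an event; not an SL1 object."""
    severity: str
    first_seen: datetime
    acknowledged: bool


def should_escalate(event, now, severity="Critical", elapsed=timedelta(minutes=10)):
    """Return True only when all three criteria from the example policy are met."""
    return (
        event.severity == severity                 # Severity = Critical
        and now - event.first_seen >= elapsed      # and 10 minutes has elapsed
        and not event.acknowledged                 # and event is NOT acknowledged
    )


now = datetime(2024, 1, 1, 12, 0)
event = Event("Critical", now - timedelta(minutes=12), acknowledged=False)
print(should_escalate(event, now))  # True
```

If any one criterion fails, for example because someone acknowledges the event after 8 minutes, the function returns False and no email would be sent.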

To create additional automation policies for the remaining steps in the section Sample Escalation Process for Acknowledging Events, perform the steps above, but supply the following values:

Policy Name | Elapsed Time | Available Actions
event_not_acknowleged_20_minutes | and 20 minutes have elapsed | event_escalation_CS_rep
event_not_acknowleged_30_minutes | and 30 minutes have elapsed | event_escalation_Dir_of_CS
event_not_acknowleged_45_minutes | and 45 minutes have elapsed | event_escalation_tier3
event_not_acknowleged_60_minutes | and 1 hour has elapsed | event_escalation_chief_eng
event_not_acknowleged_90_minutes | and 1 hour 30 minutes has elapsed | event_escalation_Dir_of_Impl
event_not_acknowleged_120_minutes | and 2 hours has elapsed | event_escalation_VP_of_Service
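Taken together, the seven automation policies behave like a lookup table keyed on how long a critical event has gone unacknowledged. The sketch below lists which action policies would already have run at a given point, assuming each policy fires once when its elapsed time is reached. POLICIES and actions_fired are illustrative names only. The policy and action names are copied from the steps and table above.

```python
# (policy name, elapsed minutes, action policy), copied from the steps and table above.
POLICIES = [
    ("event_not_acknowleged_10_minutes", 10, "event_escalation_Dir_of_Ops"),
    ("event_not_acknowleged_20_minutes", 20, "event_escalation_CS_rep"),
    ("event_not_acknowleged_30_minutes", 30, "event_escalation_Dir_of_CS"),
    ("event_not_acknowleged_45_minutes", 45, "event_escalation_tier3"),
    ("event_not_acknowleged_60_minutes", 60, "event_escalation_chief_eng"),
    ("event_not_acknowleged_90_minutes", 90, "event_escalation_Dir_of_Impl"),
    ("event_not_acknowleged_120_minutes", 120, "event_escalation_VP_of_Service"),
]


def actions_fired(minutes_unacknowledged):
    """Return the action policies whose elapsed-time condition has been reached."""
    return [
        action
        for _, elapsed, action in POLICIES
        if minutes_unacknowledged >= elapsed
    ]


print(actions_fired(35))
```

For an event left unacknowledged for 35 minutes, the sketch returns the first three action policies, which matches the first three escalations in the sample process.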

Example Email and Example Logs

When the system generates an event with a severity of "Critical" and the event is not acknowledged within 10 minutes, the system automatically sends an email, as defined in the example policy above.

In the Events page, you can view the escalation actions by clicking the number hyperlink in the Automated Actions column for a critical event.

If you are using the SL1 classic user interface, you can go to the Event Console and view the escalation actions by clicking the mail icon for a critical event.

The user interface displays the Event Actions Log page, where you can view a record of the escalation action.