A Service Level Agreement (SLA) is an understanding between two parties, usually a service provider and a customer, in which an expected level of service is formally defined. It can be legally binding and refers to the contracted performance, or delivery time, of the service. How this is defined varies depending on the service being provided. For example, consider an online advertiser who provides ad content as the service provider and the owner of the website the ad is placed on as the customer. The customer requires the advertisement to be delivered within 2 seconds. This would be defined in the SLA, with some penalty, usually financial, if the measured response time exceeds 2 seconds from multiple locations. The specific terms, responsibilities, and requirements are worked out between the parties and are different for each situation.
AlertSite calculates response times measured from multiple monitoring locations simultaneously, assuring SLA compliance and allowing comparison of actual performance with designated SLA objectives, operating periods, and compliance reporting exclusions.
AlertSite provides a simple way for customers to configure SLA objectives for site uptime, availability, and response time that match the SLA contract for devices that are using the SLA (MultiPOP) monitoring type. This guide assumes the reader is familiar with how to use and navigate through the AlertSite console.
- Uptime: your website is "up" when any monitoring location can successfully access your site (returns a non-error code) during the monitoring interval
- Availability: the percentage of successful measurements out of the total measurements of your site from all your monitoring locations during the report time frame
- Response Time: the time it takes the monitoring location to access your website and return from the GET request
- Error Correlation Technology (ECT): A proprietary AlertSite feature that recognizes errors at all monitoring locations simultaneously and correlates the results for accurate reporting
The values are an average over the selected time frame. For example, say you were monitoring from 3 locations, checking a time frame that included 100 measurements from each location. One of your locations was unable to access your site 5 times during that period (95% availability from that location). The other 2 locations always had access (100% availability from each of those 2 locations). Your uptime would be 100%, while the availability would be 98.33%:
(100+100+95)/3 = 295/3 = 98.33%
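The arithmetic above can be reproduced with a minimal sketch. The function names and the data layout (one list of pass/fail results per location, aligned by monitoring interval) are assumptions made for illustration and are not part of AlertSite. Availability is computed here from pooled measurements, which matches the per-location average whenever every location has the same number of checks.

```python
# Hypothetical data layout: one list of pass/fail results per monitoring
# location, with index i in each list belonging to the same interval.

def availability(results_by_location):
    """Percentage of successful checks out of all checks from all locations."""
    total = sum(len(r) for r in results_by_location.values())
    successes = sum(sum(r) for r in results_by_location.values())
    return 100.0 * successes / total

def uptime(results_by_location):
    """Percentage of intervals in which at least one location succeeded."""
    intervals = list(zip(*results_by_location.values()))
    up = sum(1 for interval in intervals if any(interval))
    return 100.0 * up / len(intervals)

results = {
    "location_a": [True] * 100,
    "location_b": [True] * 100,
    "location_c": [False] * 5 + [True] * 95,  # 5 failed checks
}

print(round(availability(results), 2))  # 98.33
print(uptime(results))                  # 100.0
```

Uptime stays at 100% because in every interval at least one location reached the site, while availability drops because 5 of the 300 individual checks failed.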
First, create a site device in your AlertSite Console by navigating to the Configuration screen, clicking the Add a new site button, and filling in the online form. The Site Type must be either an SLA Performance plan or Usage Based Monitoring plan, and the Monitoring Type must be SLA (MultiPOP).
You can select as many locations as you like from the device's Locations table, accessed by clicking the Locations button in the upper right of the configuration screen, but a minimum of 2 monitoring locations is required. You can also elect to rotate among your selected monitoring locations.
To show that your site is operating within the SLA requirements, you need to set up SLA Objectives. AlertSite uses its proprietary Error Correlation Technology (ECT) to report when all monitoring locations detect an error simultaneously, which means the site is unavailable. ECT also determines whether the site is up. Uptime statistics are especially useful in managing SLAs because they accurately reveal whether the web service was available at all. Setting objectives enables you to show that your site was in compliance during any selected time frame within your data retention period.
The Configuration: SLA Objectives page allows you to set service-level objectives for uptime, availability, and overall response time. In addition, operating periods can be defined after the fact for SLA compliance reporting, for example, to account for downtime during scheduled periodic maintenance. One-time exclusions for single-event downtimes can also be defined.
The Operating Periods section in the SLA Manager screen defines both inclusion and exclusion periods. Only time intervals listed in this section are included in the SLA Report; time intervals not defined in this section are excluded from it. Setting Operating Periods and One-Time Exclusion Periods does not halt monitoring. To halt monitoring, use Blackouts.
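The inclusion rule described above can be sketched as a simple filter: every measurement is still collected, but only those that fall inside an operating period reach the report. The names `OPERATING_DAYS`, `in_operating_period`, and `reportable` are hypothetical, and the Monday-through-Friday window is an assumed configuration, not AlertSite's internal implementation.

```python
from datetime import datetime, time

# Assumed operating period: Monday through Friday, all day.
OPERATING_DAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday is 0

def in_operating_period(ts, days=OPERATING_DAYS, start=time.min, end=time.max):
    """True if the timestamp falls on an operating day within the daily window."""
    return ts.weekday() in days and start <= ts.time() <= end

def reportable(measurements):
    """Keep only (timestamp, success) pairs inside the operating period."""
    return [m for m in measurements if in_operating_period(m[0])]

measurements = [
    (datetime(2024, 1, 5, 12, 0), True),   # Friday: included
    (datetime(2024, 1, 6, 12, 0), False),  # Saturday: monitored, not reported
    (datetime(2024, 1, 8, 9, 30), True),   # Monday: included
]

print(len(reportable(measurements)))  # 2
```

The Saturday failure is recorded by monitoring but does not count against the SLA, which is the difference between an operating period and a Blackout.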
The SLA Objectives screen is only available for devices that are configured with the SLA (MultiPOP) monitoring type. This illustration shows two SLA devices with configured SLA Objectives, and one without:
The figure below displays how the device SLA Home Page is configured, with a minimum of 99% Availability, 98% Uptime, and a Response Time expected to be less than or equal to 0.20 seconds over an operating period of Monday through Friday. There are no exclusions, either weekly or one-time:
With this configuration, an SLA Report, available only for SLA devices that have SLA Objectives configured, will display a table showing whether or not your site was in compliance with the objectives and the number of errors and checks in the selected reporting time frame:
As the values in the Errors / Checks column above show, out of 27974 checks done during the selected time frame, there were 9 Response Time errors, i.e., the response time was higher than the goal average 9 times. However, since the actual average response time is still lower than the objective, the site is in SLA compliance. Note that the number of Uptime checks shown is lower than the number of Availability and Response Time checks. This is because, for uptime, a single successful access from any one location is all that is counted in the statistics.
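The distinction between individual Response Time errors and overall compliance can be illustrated with a short sketch. The function `response_time_summary` and the sample values are hypothetical; the only behavior taken from the text is that a check above the goal counts as an error, while compliance is judged on the average.

```python
from statistics import mean

def response_time_summary(response_times, objective_seconds):
    """Count checks above the objective and test the average against it."""
    errors = sum(1 for t in response_times if t > objective_seconds)
    average = mean(response_times)
    return {
        "errors": errors,
        "checks": len(response_times),
        "average": average,
        "compliant": average <= objective_seconds,
    }

# Assumed sample: 8 fast checks and 2 slow ones, with a 0.20 second objective.
samples = [0.12] * 8 + [0.35, 0.40]
summary = response_time_summary(samples, 0.20)

print(summary["errors"], "/", summary["checks"])  # 2 / 10
print(summary["compliant"])                       # True
```

Two checks exceeded the goal, yet the average stays under 0.20 seconds, so the objective is still met.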
You can control the sensitivity of alerts you receive when your site is out of compliance, above and beyond normal notification of timeouts, keyword errors, or TCP connection errors. You can elect to be notified if 2, 3, or all of your monitoring locations detect an out-of-compliance condition from the Preferences section of Account → Manage Accounts.
If you want to be sure that you are notified when, say, 2 of your monitoring locations have detected a response time higher than the configured goal during the monitoring interval, you would select Send notification when TWO locations detect an error from the dropdown. If an out-of-bounds response time is detected from only 1 location during the interval, you would not be notified. The condition would be reported as an error in the SLA Report, but you would not receive an alert.
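The notification rule above amounts to a threshold on the number of locations reporting an error in the same interval. The sketch below is an illustration only: `should_notify` and its `threshold` parameter (an integer, or the string `"all"`) are hypothetical names, not AlertSite settings.

```python
def should_notify(errors_by_location, threshold):
    """Return True when enough locations report an error in one interval.

    errors_by_location maps a location name to True if it detected an error.
    threshold is an integer such as 2 or 3, or the string "all".
    """
    failing = sum(1 for failed in errors_by_location.values() if failed)
    required = len(errors_by_location) if threshold == "all" else threshold
    return failing >= required

# One slow location out of three: recorded as an error, but no alert at threshold 2.
print(should_notify({"a": True, "b": False, "c": False}, 2))  # False
# Two slow locations: the alert is sent.
print(should_notify({"a": True, "b": True, "c": False}, 2))   # True
```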
SLA Exclusions exclude measurements gathered in an SLA device during a specified time interval from your SLA Report. SLA Exclusions should not be confused with Blackouts. Blackouts actually disable monitoring (or notifications) for a device during a specified time interval.
With an SLA Exclusion, monitoring takes place but those measurements are not included in SLA calculations. A Blackout does not have to be applied to that SLA device for that time period in order to exclude those measurements from the SLA Report. However, if you do have Blackouts set for an SLA device, they reduce the number of checks displayed in the Errors/Checks column under Service Level Objectives in the SLA Manager.
If you need to remove monitoring checks from the SLA Report, you can add a One-Time Exclusion Period. For example, say you forgot to configure a Blackout for a planned UPS replacement and monitoring was done during the 10 minutes the system was down. You can configure a One-Time Exclusion Period to remove that time period from the SLA Report, and document the event right in the report.
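The effect of a One-Time Exclusion Period on the report can be sketched as dropping every measurement whose timestamp falls inside the excluded window. The function `apply_exclusion`, the timestamps, and the half-open window are assumptions for illustration; AlertSite's actual boundary handling is not described here.

```python
from datetime import datetime

def apply_exclusion(measurements, start, end):
    """Drop (timestamp, success) pairs that fall inside [start, end)."""
    return [m for m in measurements if not (start <= m[0] < end)]

# Assumed scenario: a 10-minute UPS replacement with no Blackout configured.
measurements = [
    (datetime(2024, 3, 4, 2, 0), True),
    (datetime(2024, 3, 4, 2, 5), False),  # failed check during the outage
    (datetime(2024, 3, 4, 2, 15), True),
]
outage_start = datetime(2024, 3, 4, 2, 3)
outage_end = datetime(2024, 3, 4, 2, 13)

reported = apply_exclusion(measurements, outage_start, outage_end)
print(len(reported))                   # 2
print(all(ok for _, ok in reported))   # True
```

The failed check was still collected by monitoring; it is simply left out of the SLA calculation.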
Applying SLA Objectives to Other Devices
If you have other SLA devices that need the same SLA Objectives, once you have set up one device, you can go to Configuration → Bulk Settings and use that device as a template for the others. For details, go to the Bulk Settings help page and click SLA Settings in the bullet list at the top.