People -
EAPEN organizes its Network Operations Center (NOC) into teams, each headed by a team leader and a service delivery manager. Each team handles a set number of clients, which gives every client a single point of contact and accountability. Alongside the teams, a server expert group brings extensive expertise in applications such as Exchange, SQL Server, and Citrix (we support over 40 commercial applications). Finally, a security research group analyzes patches, determines which machines they should be deployed to, and tracks antivirus vendor releases covering updates and zero-day vulnerabilities.
Infrastructure -
The Managed Services Technology (MST) solution that drives EAPEN's service is hosted at co-location facilities in Fremont, CA (Hurricane Electric) and Pittsburgh, PA (Expedient Communications). Security has been a key area of emphasis at EAPEN, with measures including but not limited to biometric access control, video surveillance, and access control auditing.
Systems -
EAPEN has built a variety of technologies and processes to provide a high level of proactive maintenance from its NOC. The first is the early warning system and pattern detection engine, which is built on a database of 16,000 different error conditions with corresponding resolutions. When our software detects an event in logs, performance counters, or registry values that matches one of these error conditions, it automatically creates a ticket for a NOC engineer, who then takes action based on the corresponding resolution and escalation matrix.
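The detection-to-ticket flow described above can be sketched roughly as follows. This is a minimal illustration and not EAPEN's actual software: the `ErrorCondition` and `Ticket` types, the substring matching rule, and the two sample entries are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ErrorCondition:
    """One hypothetical row of the error-condition database."""
    source: str       # "log", "perf_counter", or "registry"
    pattern: str      # text to look for in the event (assumed matching rule)
    resolution: str   # suggested fix for the NOC engineer
    escalate_to: str  # who takes over if the fix does not work


@dataclass
class Ticket:
    machine: str
    condition: ErrorCondition
    event_text: str


# Two made-up entries standing in for the real database of error conditions.
CONDITIONS: List[ErrorCondition] = [
    ErrorCondition("log", "disk is at or near capacity",
                   "Free disk space or extend the volume",
                   "server expert group"),
    ErrorCondition("perf_counter", "cpu_percent>95",
                   "Identify the runaway process",
                   "team leader"),
]


def match_event(source: str, text: str) -> Optional[ErrorCondition]:
    """Return the first known condition matching the event, if any."""
    for condition in CONDITIONS:
        if condition.source == source and condition.pattern.lower() in text.lower():
            return condition
    return None


def create_ticket(machine: str, source: str, text: str) -> Optional[Ticket]:
    """Create a ticket only when the event matches a known error condition."""
    condition = match_event(source, text)
    if condition is None:
        return None
    return Ticket(machine, condition, text)
```

In this sketch, an event that matches no known condition produces no ticket; how the real system handles unmatched events is not stated in the text.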
Second, the NOC operations system automatically routes problems to engineers and includes an SLA tracker, which the service managers use to deliver timely service and to prevent issues from falling through the cracks. Engineers performing any troubleshooting or analysis follow a “runbook” that defines the actions to be taken for different problems; anything falling outside this runbook is immediately escalated to the server expert group for action.
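As a rough illustration of the runbook rule and the SLA tracker, the sketch below looks up a problem in a runbook, escalates anything it does not cover, and flags tickets that have been open longer than their SLA allows. The runbook entries, priority names, and time limits are hypothetical values chosen for the example, not EAPEN's actual figures.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Hypothetical runbook: problem type -> action the engineer should take.
RUNBOOK = {
    "disk_full": "Clear temporary files and review log retention",
    "service_stopped": "Restart the service and check its event log",
}

# Hypothetical SLA limits per ticket priority.
SLA_LIMITS = {
    "critical": timedelta(hours=1),
    "normal": timedelta(hours=8),
}


def next_action(problem_type: str) -> Tuple[str, str]:
    """Follow the runbook if it covers the problem, otherwise escalate."""
    step = RUNBOOK.get(problem_type)
    if step is None:
        return ("escalate", "server expert group")
    return ("runbook", step)


def sla_breached(opened_at: datetime, priority: str, now: datetime) -> bool:
    """True when the ticket has been open longer than its SLA limit."""
    return now - opened_at > SLA_LIMITS[priority]
```

For example, `next_action("unknown_kernel_panic")` falls outside the sample runbook and therefore returns an escalation to the server expert group.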
These systems have evolved over years of providing service and are constantly updated to improve the quality of service. All interaction with Field Engineers goes through a ticketing system, except in emergencies such as server-down or disk-failure issues, where phone calls are initiated. This discipline minimizes the chance of an alert being missed or a problem not being addressed on time.