Senior System Support Engineer
Senior Service Engineer, Incident Commander - ESI
Organization Background:
The Engineering Services & Infrastructure team provides the foundational infrastructure (Cloud Services, Network Connectivity, Data Center Hosting, Voice, Collaboration Applications, and End User Computing) that enables our caregivers, patients, physicians, and Community Technology partners to achieve our mission. We drive cloud migrations, network modernization, and infrastructure modernization across the company.
What will you be responsible for?
- Lead the end-to-end major incident management process effectively.
- Drive structured troubleshooting, escalation, and decision‑making during high‑pressure situations.
- Provide clear stakeholder and leadership communication.
- Be accountable for the overall quality of the process and its compliance with the associated procedures, data models, policies, and technologies.
- Ensure creation of incident timelines, root cause summaries, and repair items.
- Lead or support Major Incident Reviews (MIRs) and post‑mortem discussions.
What would your day look like?
- Identify critical, impacting issues, drive bridges effectively, pull in SMEs, drive incidents end to end, document troubleshooting steps, and send timely communications.
- Audit incident tickets to ensure proper documentation, and follow up with Service Lines when tickets are non-compliant.
- Create strong process and technical documentation.
- Ensure the incidents are tracked with correct categorization and prioritization.
- Ensure that activities within a process are performed at a high level of quality and meet their associated Service Level Agreements or Operational Level Agreements.
- Identify incidents for review and participate in incident review meetings.
- Coordinate with service owners on repetitive issues and drive root cause analysis using the 5 Whys (5Y) method.
- Work in conjunction with Continual Service Improvement (CSI).
- Establish measurements and targets to improve process effectiveness and efficiency.
- Use Power BI to build interactive and visually appealing dashboards and reports.
- Manage Agile feature tasks and subtasks, and help clear impediments.
- Follow the 30-60-90 model with all new caregivers for effective knowledge transfer.
- Focus on alert reduction and drive support teams toward permanent fixes for repetitive alerts.
- Adopt AI and ML learnings for quick trend analysis and detection of repetitive patterns.
Who are we looking for?
- A bachelor's degree in computer science, information science, or a related field, or equivalent education/experience.
- Well versed in ITIL, DevOps, and Agile models.
- 5-8 years of experience in Incident and major incident management.
- 4+ years of experience in monitoring, incident management, and all modules under ITIL.
- Hands-on knowledge of resolving server, storage, database, network, and application issues.
- Expertise in ServiceNow and monitoring tools such as SCOM, SolarWinds, and NetApp.
- Experience in administration of workloads on Microsoft Azure (IaaS).
- Strong written and verbal communication skills and excellent interpersonal skills.
- Flexibility to work in 16/7 shifts (no night shifts) and on holidays.