Senior Service Engineer
Organization Background:
The Technology Engineering and Ops team provides the foundational infrastructure (Cloud Services, Network Connectivity, Data Center Hosting, Voice, Collaboration Applications, and End User Computing) that enables our caregivers, patients, physicians, and Community Technology partners to achieve our mission. We drive cloud migrations, network modernization, and infrastructure modernization across the company.
What will you be responsible for?
- Manage the workflow process for all major incidents/situations (P1/P2/P3)
- Monitor, triage, and resolve alerts and incidents from various systems; correlate alerts and respond appropriately.
- Hands-on experience with monitoring tools such as SCOM and/or AppInsights.
- Ability to use Power BI dashboards for reporting.
What would your day look like?
- Identify critical, high-impact issues, open a bridge, pull in SMEs, capture timelines, drive the incident end to end, and document troubleshooting steps.
- Investigate and diagnose major incidents to restore the failed application or service as quickly as possible.
- Send timely updates via SMS and notifications to service owners, application owners, and caregivers.
- Excellent written and verbal communication and organizational skills
- Follow the escalation path defined in the escalation policy when needed.
- Ensure that incidents are assigned to the right support groups and are resolved on time.
- Ensure that activities within a process are performed at a high level of quality and meet their associated Service Level Agreements or Operational Level Agreements.
- Act as the focal point for the process, communicating with service owners and management.
- Monitor the health of the organization's infrastructure systems to minimize critical impact
- Troubleshoot enterprise-level infrastructure issues, solving business-impacting issues and day-to-day incidents
- Well versed in ITIL processes (Incident, Change, and Problem)
- Compile, analyze, and report on incident data to the leadership team
- Good exposure to the ServiceNow tool.
- Well versed in the Agile process for creating and tracking ADOs
Who are we looking for?
- 4+ years of experience in monitoring and incident management
- Expertise in ITIL (incident and event management), Agile, and DevOps processes
- Expertise in monitoring tools such as SCOM and/or AppInsights
- Familiarity with Cisco, Juniper, and Palo Alto devices and with network gear configuration
- Understanding of LAN, ports and services, subnets, routing, etc.
- Strong interpersonal and communication skills, both written and verbal
- Flexibility to work in 16/7 shifts (no night shifts) and on holidays
- Ability to work in an ambiguous environment with a global team setup
- Experience working in a feature crew model and knowledge of the Agile framework
- Ability to learn and adapt to a changing landscape and acquire new skills as technology advances
- Azure certifications: AZ-900 recommended
- A bachelor's degree in computer science, information science, or a related field