Senior Service Engineer
Organization Background:
The Cloud & Hosting Engineering team drives cloud migrations and engineering, OnPrem hosting, compute, storage, data protection, and data engineering for the Technology, Engineering & Operations organization. We support OnPrem and Azure cloud infrastructure running Windows and Linux VMs and SQL and Oracle databases.
What will you be responsible for?
Primary Responsibility: Manage and ensure the availability of OnPrem compute infrastructure, with a strong focus on virtualization technologies such as VMware, Hyper-V, and Windows environments.
- Ensure availability and reliability of OnPrem and Cloud infrastructure, including deployment, management, and governance of virtualized & Cloud environments.
- Analyze, troubleshoot, and resolve issues within VMware, Hyper-V, Azure and Windows platforms efficiently and effectively.
- Perform error resolution and root cause analysis (RCA) for major incidents affecting OnPrem systems.
- Present key performance indicators (KPIs) related to OnPrem operations to stakeholders.
- Demonstrate a growth mindset by learning and working with core technologies, particularly Windows Server, VMware, and Hyper-V.
- Deploy, manage, and govern Azure Cloud infrastructure as needed.
- Automate cloud deployments using Infrastructure as Code (IaC) tools such as Terraform, Bicep, or ARM templates, and manage configurations using Ansible.
- Build and modernize solutions in the Azure Cloud for compute environments.
- Apply engineering and automation skillsets to streamline operations both OnPrem and in the cloud.
- Modernize infrastructure by adopting Site Reliability Engineering (SRE) principles and DevOps tools and practices.
- Support Azure cloud deployments and management using IaC as required by project demands.
What would your day look like?
- Take ownership of managing robust OnPrem, Azure Hosting, and Compute environments—including Windows, Linux, and Kubernetes—with a focus on high availability, scalability, and security.
- Leverage the latest Site Reliability Engineering (SRE) tools to proactively monitor, automate, and enhance infrastructure performance, driving operational excellence across platforms.
- Automate configuration management and drift remediation through advanced solutions like Ansible, ensuring consistent and compliant environments.
- Design and implement cloud deployment pipelines and operational processes using cutting-edge tools such as Terraform and Azure Pipelines to accelerate delivery and minimize manual intervention.
- Actively work on incidents, changes, service requests, and complex engineering projects—contributing to both day-to-day stability and strategic innovations.
- Manage and optimize virtualization platforms like VMware and Hyper-V, ensuring seamless operations for diverse workloads.
- Champion “Get Current, Stay Current” initiatives to keep technology stacks modern, secure, and aligned with industry best practices.
- Utilize data-driven insights to identify opportunities for optimization and modernization, continuously improving infrastructure efficiency and reliability.
- Empower users with self-service automation capabilities, significantly reducing manual workloads and enabling rapid service delivery.
- Demonstrate expertise in scripting and automation using tools such as Bash, shell, PowerShell, and Ansible, streamlining routine processes and accelerating innovation.
- Lead multiple projects simultaneously, engaging with stakeholders, delivering transparent reporting, and creating impactful dashboards to visualize key metrics and progress.
- Establish, document, and share standards, best practices, and operational procedures—maintaining a centralized and accessible repository for knowledge management.
Who are we looking for?
- 4 to 7 years of experience managing enterprise compute environments, both OnPrem and Cloud
- Well versed in Linux/Unix or Windows, with virtualization or Azure experience
- Must have experience handling enterprise-scale compute environments, including troubleshooting and performance tuning
- Must have experience writing scripts in Bash, shell, PowerShell, or Ansible
- Must have experience migrating OnPrem environments to the cloud, preferably Azure
- Experience with DevOps tools such as Terraform, GitHub Actions, Ansible, or Azure DevOps
- OS design, installation, configuration, and support in an Enterprise environment.
- Strong networking knowledge (LAN/WAN, IP addressing & Subnetting, HTTP(S), FTP, SMTP, VPN)
- Experience troubleshooting and implementing operating system enhancements to improve reliability and performance
- Ability to work in an ambiguous environment with a globally distributed team
- Experience working in a feature crew model and knowledge of Agile frameworks
- Ability to learn and adapt to a changing landscape and acquire new skills as technology advances
- Strong written and verbal communication skills and excellent interpersonal skills