Observability Engineer
Build and maintain observability solutions across metrics, logs, and traces.
Implement and support monitoring tools such as Prometheus, Grafana, ELK Stack, Splunk, Datadog, or New Relic.
Develop dashboards and alerts to surface meaningful operational insights.
Create and maintain telemetry pipelines and integrations using OpenTelemetry or equivalent frameworks.
Support incident detection, response, and root cause analysis through observability data.
Define and measure SLIs/SLOs for key services to drive reliability goals.
Assist in tuning and optimizing alert thresholds and anomaly detection logic.
Key Tools: Datadog, SolarWinds, NetBrain, ThousandEyes, Open Manager, Aria, Dynatrace, Grafana.
Scripting and Automation: Python, PowerShell, HTML, Terraform, Ansible, or CI/CD pipelines.