Lead Cybersecurity Product Engineer
About the Role
We are looking for a hands-on Data Engineer to design, build, and operate robust data pipelines and platforms on Snowflake and Azure. You will use strong SQL, Python/PySpark, ADF pipelines, and modern data-modeling practices to ingest data from diverse sources and enable AI/ML use cases via VectorDB indexing and embeddings. The role emphasizes reliability, performance, cost efficiency, and secure data operations in line with our enterprise platforms and standards.

Key Responsibilities
- Design and build data pipelines on Snowflake and Azure (ADF, PySpark) to ingest data from REST APIs, files, and databases into curated zones.
- Model data optimized for analytics, reporting, and downstream applications.
- Develop embeddings and VectorDB indices to power semantic search and retrieval (e.g., generating embeddings, indexing them into enterprise-approved vector stores, and integrating with pipeline orchestration).
- Own performance and cost optimization in Snowflake (SQL tuning, partitioning, caching, clustering, compute sizing).
- Implement CI/CD and DevOps practices (Git branching, automated deployments for ADF and Snowflake).
- Harden reliability (monitoring, alerting, retry logic, SLA tracking) and security/compliance (RBAC, secrets management, data governance, data lineage).
- Collaborate with stakeholders (product, analytics, and platform teams) to translate requirements into technical designs and deliver incremental value.

Must-Have Qualifications
- 4–7 years of total data engineering experience in large-scale enterprise systems.
- Snowflake: at least 3 years of experience, with exposure to warehouse configuration, schema design, performance tuning, stored procedures/tasks, and loading strategies. Exposure to Snowflake Cortex AI.
- SQL/Python/PySpark: design and implement scalable data processing solutions using SQL, Python, and distributed compute frameworks, including unit and integration tests.
- Azure & ADF: ADLS Gen2; ADF pipelines/activities, triggers, and parameterization; monitoring and troubleshooting.
- Data modeling: apply data-modeling techniques, including medallion architecture (Bronze/Silver/Gold).
- API ingestion: design resilient ingestion of REST/JSON, including pagination, authentication, and rate-limit handling.
- VectorDB & embeddings: experience generating embeddings and building vector indices for retrieval-augmented scenarios.
- Knowledge graphs: exposure to building knowledge graphs and to the Gremlin or Cypher graph query languages on Cosmos DB/Neo4j.
- Version control & CI/CD: Git, pull requests, automated deployment pipelines.
- A results-oriented mindset with strong analytical and problem-solving skills.

Good-to-Have
- Experience in the healthcare industry.
- Prior experience on data migration projects.