Job title: Site Reliability Engineer
Job type: Permanent
Employment type: Full-time
Salary type: Annual
Salary: negotiable
Job published: 02-04-2025
Job ID: 32922
Contact name: Scott Daly
Phone number: +441782376766
Contact email: scott.daly@woodfordgray.com

Job Description

About the job: Site Reliability Engineers (x2), Manhattan

  • On-site
  • Full-time
  • $180K/yr - $220K/yr

 

Overview:

Woodford Gray has partnered with a Series B startup, a data security company focused on transforming the future of data protection. We are looking for a highly skilled Infrastructure Engineer to help automate and manage large-scale, distributed, fault-tolerant systems across multiple cloud environments.



Who You Are:

  • You have a strong understanding of Linux internals and excel at troubleshooting and resolving complex technical issues.
  • You possess advanced expertise in Kubernetes, Docker, and Terraform.
  • You have 4+ years of experience managing and deploying scalable infrastructure across AWS, GCP, and other cloud platforms.
  • You are skilled at automating single-tenant deployments across diverse cloud environments and compute configurations.
  • You are experienced with infrastructure-as-code, particularly using Terraform and similar tools.
  • You have a proven track record in monitoring and improving critical service metrics, including availability, scalability, and overall system health for customer-facing platforms.
  • You are adept at streamlining infrastructure to minimize operational overhead and optimize resource utilization.



Responsibilities:

  • Address user emergencies, respond to platform alerts, and manage support requests, ensuring timely and effective resolutions.
  • Automate and oversee the deployment of single-tenant environments across AWS, GCP, and additional cloud platforms.
  • Assist in preparing for new customer deployments, providing guidance on architecture and resource optimization.
  • Continuously monitor and optimize service metrics to maximize availability and scalability and minimize latency for customer-facing systems.
  • Simplify infrastructure configurations to reduce operational complexity and improve efficiency.