Senior Site Reliability Engineer (SRE)

A team of experts providing analytical services to healthcare clients is looking for a Senior Site Reliability Engineer (SRE) for a long-term engagement.

You will join an international team of first-class professionals who are passionate about creating products that improve the quality of medical services. 

The company offers exposure to a variety of industries and technologies, room to grow as a professional, time within projects to learn new skills, and the opportunity to work with phenomenal coworkers, some of the best people on the planet.

The Senior Site Reliability Engineer (SRE) will be responsible for enhancing the reliability, scalability, and security of microservices within the healthcare cloud platform. This role involves designing, implementing, and managing robust, scalable, and secure solutions. The ideal candidate has experience with GCP, infrastructure as code (IaC), and SRE practices.

Responsibilities:

Proactively monitor system performance, anticipate potential issues, and implement solutions;
Collaborate with development and operations teams to ensure alignment of goals and efficient workflows;
Create and maintain comprehensive solution documentation, including C4 models, deployment diagrams, network diagrams, and component views;
Implement and advocate for SRE best practices, including monitoring, alerting, incident response, and capacity planning;
Collaborate with development teams to integrate the latest technology advancements, ensuring scalability and agility;
Navigate complex corporate environments, aligning technical operations with broader business goals.

Fundamentals:

3+ years of experience in cloud-native Site Reliability Engineering;
Expertise using Terraform;
Proficiency in GCP, with demonstrated expertise in cloud infrastructure and services;
Advanced knowledge of Kubernetes, including cluster management and security policies;
Familiarity with microservices architecture in a cloud-native environment;
Solid understanding of network architecture and security protocols;
Strong analytical and strategic thinking skills;
Experience collaborating effectively across organizational boundaries, building relationships, and exchanging ideas to achieve broad organizational goals;
Excellent problem-solving, negotiation, and organizational skills;
Outstanding written and verbal communication skills that drive results at scale.

Nice to have:

Understanding of HIPAA Compliance;
Experience with Google Cloud's Operations Suite;
Experience with Go, Python, PHP.

Technical Stack:

Google Cloud's Operations Suite
GCP Cloud Build
Prometheus, Grafana
OLTP: Google Cloud Spanner
Google Kubernetes Engine
Terraform
Go, Python, PHP

Benefits:

Flexible working hours;
Remote work;
Interesting projects to work on;
Exposure to a variety of industries and technologies.
