Site Reliability Engineer (GKE + GCP)

Remote Full-time
Responsibilities:
• Work on a team of extremely talented platform engineers to help maintain and scale the current and future state services platform.
• Help architect and develop the future state compute platform by applying industry best practices and embracing new technologies to support the future growth of the business.
• Help influence the product roadmaps of GCP (our primary cloud provider) to better suit our future state architecture.
• Work collaboratively with business and technical stakeholders to develop and architect enhancements to the compute platform capabilities that enable them to develop and iterate on applications that power the business.
• Identify opportunities to introduce automation and improvements that eliminate repetitive operational tasks (DRY).
• Participate in the on-call rotation to ensure operational excellence and overall platform health.

Requirements:
• 5+ years of experience in platform engineering/SRE roles using an object-oriented language (Python, Golang, etc.)
• Bachelor’s degree in Computer Science, Computer Engineering, or an equivalent combination of education and experience
• Extensive experience working with Kubernetes in a public cloud (GKE, EKS, AKS, etc.)
• Experience working with Istio/Service Mesh
• Experience working with IaC (Terraform, Pulumi, etc.)
• Experience working within a public cloud environment (GCP, AWS, Azure, etc.)
• Experience working with CI/CD tools such as Argo, Buildkite, Travis CI, Jenkins, Spinnaker, etc.
• Experience working with platform observability tools (Prometheus, Thanos, Grafana, Fluent Bit, Cloud Monitoring, Cloud Logging, Datadog, PagerDuty, CloudWatch, Kibana, Elasticsearch, Splunk, VictorOps, etc.)
• Experience with networking
• Experience and desire to work in an agile environment
• Analytical mindset and passion for solving business problems with technology

Nice to Haves:
• Experience working with dev testing tools and patterns such as Garden, Flagger, Canary Deployments, Blue/Green Testing, A/B Testing
• Experience setting up and working with Kubernetes Admission Control (Kyverno, OPA, etc.)
• Experience working with workload scaling (HPA, VPA, Capacity Planning/Reservations, etc.)
