Site Reliability Engineer II

Yahoo Inc USA (Remote) Full-Time
It takes powerful technology to connect our brands and partners with an audience of hundreds of millions of people. Whether you’re looking to write mobile app code, engineer the servers behind our massive ad tech stacks, or develop algorithms to help us process trillions of data points a day, what you do here will have a huge impact on our business—and the world.

The SRE Observability team builds and manages Infrastructure as Code (IaC) for both on-prem and cloud applications, driving automation and scalability. We enhance tool integrations to strengthen application reliability and oversee CI/CD pipelines to streamline the deployment of high-availability applications and workflows. Our team optimizes observability, alerting, incident response, and on-call solutions, collaborating closely with SaaS vendors and Yahoo engineering teams to ensure reliability and efficiency at scale.

About You: 

You are a curious problem solver who thrives in fast-paced environments, tackling complex challenges at a global scale. Passionate about automation, reliability, and efficiency, you think beyond the obvious to build data-driven solutions that operate at massive scale. You adapt to change, drive improvements, and collaborate to optimize systems that impact engineering teams worldwide. If this excites you, join us!

Your Day: 

As an SRE on Yahoo’s Observability team, you will specialize in managing observability (o11y), incident, and on-call solutions, ensuring high availability, reliability, and scalability. You will support both open-source and SaaS solutions that power Yahoo’s event response life cycle. We focus on enhancing and automating workflows that empower DevOps teams across Yahoo. You will solve problems of varying complexity, both individually and in a team environment.

Key Responsibilities

  • Maintain and improve comprehensive monitoring, alerting, and logging systems (e.g., OpenTSDB, Grafana, Splunk, Chronosphere, BigPanda, Rootly).

  • Enhance o11y guides & documentation to support ongoing service management operations.  

  • Ensure 24/7/365 availability, scalability, and incident response for critical applications.

  • Participate in a global on-call rotation. Troubleshoot, resolve, and document production issues, escalating when necessary.

  • Monitor and report performance, availability, and SLA metrics.

  • Work with development teams to enhance, document, and improve system operability.

  • Develop, configure, and manage Terraform-based Infrastructure as Code (IaC) configurations to automate provisioning, scaling, and management of cloud environments.

  • Build CI/CD pipelines and iterate on existing Chef/Ansible templates for application deployments used for OS builds, configurations, or upgrades.

  • Modernize infrastructure by performing OS upgrades and migrating services to Kubernetes.

  • Oversee change management coordination with key stakeholders.

  • Develop and support automation scripts and tools for operational efficiency, leveraging AWS and GCP SDKs and APIs.

  • Provide stakeholders with progress updates on shared initiatives (email, Jira, Slack, tickets, Git, meetings).

  • Manage situations of moderate complexity and make timely decisions to ensure smooth operations

  • Develop business operations workflows for large applications to meet business needs.

Minimum Job Qualifications 

  • Bachelor’s or Master’s degree in Computer Science or Engineering, or 5+ years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Engineering roles.

  • 2+ years of programming experience in Bash, Python, Java or Go. 

  • In-depth knowledge of Linux distributions like RedHat and CentOS; Linux certifications (RHCT, RHCE, LPIC) are a plus.

  • Hands-on experience with AWS core services such as EC2, S3, RDS, EKS, Lambda, and networking services like VPC, Route 53, API GW, and Transit Gateway

  • Understanding of containerization and orchestration technologies, especially Kubernetes 

  • Strong understanding of networking concepts (DNS, TCP/IP, HTTP/S, Load Balancing) and cloud-native networking in AWS.

  • Experience with CI/CD tools such as GitHub Actions, Jenkins, ArgoCD, and Screwdriver.

  • An understanding of IaC concepts, specifically using Terraform 

  • Ability to troubleshoot & resolve hardware, network and software problems 

  • Experience with OSS and/or commercial observability tools such as Grafana, New Relic, Datadog, Splunk, Chronosphere, or AWS and GCP native telemetry tools.

  • Strong skills in integrating diverse APIs and web services.

  • Strong troubleshooting skills with a focus on automation, scalability, and resilience.

  • Excellent communication and interpersonal skills.

  • Strong desire to learn new technologies and systems as part of daily work.

Preferred Job Qualifications 

  • Knowledge and operational experience running large-scale global distributed systems

  • Expert-level proficiency with Terraform for IaC

  • Strong expertise in Splunk Cloud and OpenTelemetry

  • Experience managing multi-region, multi-AZ cloud deployments with a focus on disaster recovery and fault tolerance

  • Proficient in Slack, Jira & Confluence

The material job duties and responsibilities of this role include those listed above as well as adhering to Yahoo policies; exercising sound judgment; working effectively, safely and inclusively with others; exhibiting trustworthiness and meeting expectations; and safeguarding business operations and brand integrity.

At Yahoo, we offer flexible hybrid work options that our employees love! While most roles don’t require regular office attendance, you may occasionally be asked to attend in-person events or team sessions. You’ll always get notice to make arrangements. Your recruiter will let you know if a specific job requires regular attendance at a Yahoo office or facility. If you have any questions about how this applies to the role, just ask the recruiter!

Yahoo is proud to be an equal opportunity workplace. All qualified applicants will receive consideration for employment without regard to, and will not be discriminated against based on, age, race, gender, color, religion, national origin, sexual orientation, gender identity, veteran status, disability or any other protected category. Yahoo will consider for employment qualified applicants with criminal histories in a manner consistent with applicable law. Yahoo is dedicated to providing an accessible environment for all candidates during the application process and for employees during their employment. If you need accessibility assistance and/or a reasonable accommodation due to a disability, please submit a request via the Accommodation Request Form or call +1.866.772.3182. Requests and calls received for non-disability related issues, such as following up on an application, will not receive a response.

We believe that a diverse and inclusive workplace strengthens Yahoo and deepens our relationships. When you support everyone to be their best selves, they spark discovery, innovation and creativity. Among other efforts, our 11 employee resource groups (ERGs) enhance a culture of belonging with programs, events and fellowship that help educate, support and create a workplace where all feel welcome. Check out our diversity and inclusion page to learn more.

The compensation for this position ranges from $96,000.00 - $200,000.00/yr and will vary depending on factors such as your location, skills and experience. The compensation package may also include incentive compensation opportunities in the form of discretionary annual bonus or commissions, in addition to equity incentives. Our comprehensive benefits include healthcare, a great 401k, backup childcare, education stipends and much (much) more.

Currently work for Yahoo? Please apply on our internal career site.

Job Snapshot

Employee Type: Full-Time
Location: USA (Remote)
Job Type: Other
Experience: Not Specified
Date Posted: 03/11/2025