Manager - Site Reliability Engineering

Michaels Irving, TX (Onsite) Full-Time
Support Center - Irving

We are seeking a highly skilled and experienced Manager - DevOps Engineering to lead our DevOps team in fostering a culture of collaboration, continuous improvement, and innovation across our development, operations, security, and QA teams. The ideal candidate will have a strong background in software development, systems administration, automation, and cloud infrastructure. This leadership role requires technical expertise, strong communication skills, and a passion for building scalable, reliable, and efficient systems.

Major Activities

  • Lead, mentor, and grow a team of SREs, fostering effective collaboration and high-performance culture with a focus on reliability, innovation, and continuous improvement.
  • Oversee the design, implementation, and monitoring of the reliability and performance of GCP-hosted services.
  • Define and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs) for all services. Establish and improve incident management processes, including post-mortem analysis and root cause investigation, to prevent recurrence of failures.
  • Lead efforts to automate operational processes, reducing manual work and improving system reliability. Ensure systems are architected with fault tolerance, scalability, and recovery in mind.
  • Oversee the deployment and maintenance of cloud infrastructure. Drive continuous improvement in cost efficiency by optimizing GCP resource usage and scaling strategies. Ensure infrastructure is built to handle scale, resiliency, and security requirements. Partner with engineering teams to align infrastructure needs with application development.
  • Ensure comprehensive monitoring of systems, applications, and infrastructure with appropriate alerting mechanisms. Define and track key metrics to evaluate the health and performance of GCP-hosted services. Implement dashboards and reporting mechanisms for stakeholders to track system performance and reliability.
  • Lead the effort to create clear, actionable incident reports and improve reporting processes for transparency. Work closely with development and product teams to ensure reliable application delivery and troubleshooting. Influence application architecture design decisions to ensure reliability and operational scalability. Advocate for a strong DevOps culture with an emphasis on automation and continuous integration/deployment (CI/CD).

Other duties as assigned

Minimum Education

  • Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent work experience).

Minimum Type of Experience the Job Requires

  • 7+ years of experience in Site Reliability Engineering, with familiarity with DevOps or a similar area, including at least 2-3 years in a managerial or leadership capacity.
  • Solid understanding of cloud platforms such as Google Cloud and Oracle Cloud, and experience with infrastructure-as-code (Terraform, CloudFormation).
  • Proven experience with monitoring, observability, and logging platforms and tools (Prometheus, Grafana, ELK stack, Datadog, Google Cloud Observability, etc.).
  • Good understanding of containerization and orchestration tools like Docker, Kubernetes, and Helm, as well as CI/CD pipelines and development and deployment strategies.
  • Experience in leading incident response and disaster recovery efforts. Expertise in managing large-scale distributed systems and microservices architecture.
  • Experience in the retail industry with a good understanding of e-commerce applications.

Other

  • Excellent leadership and team management skills with the ability to foster a collaborative, inclusive, and productive environment.
  • Strong problem-solving and troubleshooting skills.
  • In-depth knowledge of security best practices and vulnerability management.
  • Ability to balance technical depth with strategic decision-making to drive business outcomes.
  • Exceptional communication skills, both verbal and written.

Preferred Education

  • Master’s degree in Computer Science, Information Technology, or a related field (or equivalent work experience).

Preferred Type of Experience the Job Requires

  • Google Cloud Professional DevOps Engineer, Google Cloud Professional Cloud Architect
  • Kubernetes Certification (CKA/CKAD), Terraform Associate, or similar certifications.

Applicants in the U.S. must satisfy federal, state, and local legal requirements of the job.

At The Michaels Companies Inc, our purpose is to fuel the joy of creativity. As the leading creative destination in North America, we operate over 1,300 stores in 49 states and Canada and online at

and . The Michaels Companies, Inc. also owns Artistree, a manufacturer of custom and specialty framing merchandise, and , a dedicated handmade goods marketplace. Founded in 1973 and headquartered in Irving, Texas, Michaels is the best place for all things creative. For more information, please visit 

At Michaels, we prioritize the wellbeing of our teams by providing robust benefits for both full-time and part-time Team Members. Our benefits include health insurance (medical, dental, and vision), paid time off, tuition assistance, generous employee discounts, and much more. For more information, visit

.

Michaels is an Equal Opportunity Employer. We are here for all Team Members and all Makers to create, innovate and be better together.

Michaels is committed to the full inclusion of all qualified individuals. In keeping with this commitment, Michaels will assure that people with disabilities are provided reasonable accommodations. Accordingly, if a reasonable accommodation is required to fully participate in the job application or interview process, to perform the essential functions of the job, and/or to receive all other benefits and privileges of employment, please contact Customer Care at 1-800-642-4235 (1800-MICHAEL).


Job Snapshot

Employee Type: Full-Time
Street Address: 3939 West John Carpenter Freeway
Location: Irving, TX (Onsite)
Job Type: Management
Experience: Not Specified
Date Posted: 02/05/2025
