# Site Leader

**Company:** [Weekday AI](http://jobs.workable.com/companies/pxG9rDgnvZm2c86JUchT1j.md)
**Location:** Remote
**Workplace:** remote
**Employment type:** Full-time
**Department:** Weekday's Client via platform

[Apply for this job](http://jobs.workable.com/view/d429d3c0-79f2-47e7-87df-7e3fe49d2ca1)

## Description

**This role is for one of Weekday's clients**

Min Experience: 10 years

Location: Poland (Remote)

Job Type: Full-time
  
We are seeking a highly experienced and driven Site Leader with a strong background in Site Reliability Engineering (SRE) and Infrastructure to lead and scale our engineering operations. This role is ideal for a seasoned Engineering Manager who thrives at the intersection of leadership, system reliability, and large-scale infrastructure management. As a Site Leader, you will be responsible for building resilient systems, managing high-performing teams, and ensuring the availability, scalability, and performance of mission-critical platforms.

## Requirements

**Key Responsibilities**

-   Lead and manage SRE and Infrastructure teams, driving operational excellence and fostering a culture of reliability and accountability.
-   Define and execute the overall infrastructure and reliability strategy aligned with business goals.
-   Oversee the design, deployment, and maintenance of scalable, highly available, and secure systems.
-   Establish and monitor SLAs, SLOs, and SLIs, ensuring consistent service performance and uptime.
-   Drive incident management processes, including root cause analysis, postmortems, and continuous improvement initiatives.
-   Collaborate with product and engineering teams to embed reliability and scalability into the development lifecycle.
-   Champion automation, observability, and proactive monitoring to minimize downtime and improve system health.
-   Manage infrastructure costs, capacity planning, and resource optimization.
-   Mentor and develop engineering managers and senior engineers, building a strong leadership pipeline.
-   Ensure adherence to best practices in cloud infrastructure, DevOps, and security compliance.

**Required Skills & Qualifications**

-   10–15 years of experience in software engineering, infrastructure, or SRE, with at least 3–5 years in an Engineering Manager or leadership role.
-   Proven expertise in Site Reliability Engineering (SRE) principles, including reliability, scalability, and fault tolerance.
-   Strong experience with cloud platforms (such as AWS, GCP, or Azure) and modern infrastructure architectures.
-   Deep understanding of infrastructure as code (Terraform, CloudFormation), CI/CD pipelines, and containerization technologies (Docker, Kubernetes).
-   Demonstrated ability to lead and scale distributed engineering teams.
-   Strong problem-solving skills with a focus on system-level thinking and root cause analysis.
-   Experience with monitoring and observability tools such as Prometheus, Grafana, ELK stack, or similar.
-   Excellent stakeholder management and communication skills, with the ability to influence cross-functional teams.

**Preferred Qualifications**

-   Experience managing large-scale, high-traffic production systems.
-   Background in DevOps transformation and cloud-native architecture.
-   Familiarity with security best practices and compliance frameworks.