# SRE/DevOps Engineer - Sunnyvale, CA, US

**Company:** [Kody](http://jobs.workable.com/companies/aDWSRg9smKoHu3GYFnXU9H.md)
**Location:** Sunnyvale, United States
**Workplace:** On-site
**Employment type:** Full-time
**Department:** Technology

[Apply for this job](http://jobs.workable.com/view/e3a2d058-335f-4bed-b6cf-75c95bbc107f)

## Description

### **About the Role**

We are seeking a high-caliber **Senior Site Reliability Engineer (SRE)** based in California to ensure the scalability, reliability, and runtime efficiency of our next-generation platform. In this role, you will bridge the gap between development and operations, working closely with our global engineering teams.

We are looking for a unique engineering mindset: someone who brings a positive, collaborative energy to the daily grind, but can instantly pivot into a hyper-focused, high-ownership responder when an incident strikes.

### **Key Responsibilities**

-   **Production Reliability & Guardrails:** Partner with the Platform Engineering team to implement reliability guardrails, ensuring applications running on **AWS** meet strict uptime and SLA requirements.
-   **CI/CD & Repository Management:** Own the deployment pipelines and code management practices, which run primarily on **GitHub**.
-   **Incident Management:** Lead rapid-response troubleshooting during production incidents; conduct thorough blameless post-mortems to continuously harden our systems.
-   **Observability & Performance:** Implement advanced monitoring, logging, and alerting systems to proactively detect and mitigate system anomalies.
-   **Cross-Border Collaboration:** Act as a key technical bridge between our US operations and international engineering hubs, using bilingual communication (Mandarin and English) to keep complex technical work aligned.

## Requirements

### **1\. Technical Focus**

-   **Ecosystem Expertise (Must-Haves):** Deep, practical experience managing application deployment and runtime environments on **AWS**, alongside expert-level knowledge of advanced Git workflows and GitHub Actions on **GitHub**.
-   **Core Toolkit:** Strong proficiency in monitoring tools, log management, and scripting for quick triaging and troubleshooting.

### **2\. Soft Skills & Characteristics**

-   **Ownership & Transparency:** You are radically **open**, highly responsive, and communicative. You don't just clear tickets; you own the production environment's health end-to-end.
-   **Resilience Under Pressure:** High psychological resilience. You maintain a positive attitude during smooth operations, yet bring a healthy sense of urgency and laser focus to high-stakes incidents.
-   **Bilingual Capability:** Absolute fluency in **Mandarin and English** (verbal and written) is mandatory for effective technical alignment across our global teams.

## Benefits

-   Competitive base salary + equity packages aligned with California market standards.
