# Senior/Staff/Principal SWE - Observability Engineering

**Company:** [AppGate Cybersecurity, Inc.](http://jobs.workable.com/companies/azBkdkXui3RVLSUCDJYAAh.md)
**Location:** Remote
**Workplace:** remote
**Employment type:** Full-time

[Apply for this job](http://jobs.workable.com/view/79593878-d8ee-49f4-af7e-bd461ac7e045)

## Description

**About AppGate**

AppGate secures and protects an organization's most valuable assets with its high-performance Zero Trust Network Access (ZTNA) solution. AppGate is the only direct-routed ZTNA solution built for peak performance, superior protection, and seamless interoperability. AppGate safeguards Fortune 500 enterprises worldwide. Learn more at appgate.com.

**About the Role**

We’re looking for an **Observability Engineer** (Senior/Staff/Principal level) who has shipped distributed tracing systems, designed high-cardinality pipelines, and knows OpenTelemetry inside and out. You will own the end-to-end design and implementation of the AppGate observability fabric — from telemetry SDKs in our clients and gateways, to the LogForwarder pipeline, to customer-side integrations.

You’ll make the foundational technical decisions — transport protocols, sampling strategies, schema design, correlation models — that determine whether our platform scales gracefully to hundreds of millions of events per day. This is a builder’s role with a strategist’s reach.

**Key Responsibilities**

Your engineering work will directly enable next-generation capabilities, including:

• **OpenTelemetry-Native Telemetry Fabric:** Logs and distributed traces from clients, controllers, gateways, and connectors, all correlated by session, user, device, and trace ID across the full ZTNA flow.

• **High-Cardinality Data Pipeline:** An OTLP-based ingestion and routing layer engineered for 100M+ events per day, with attribute filtering, redaction, and tail-sampling.

• **End-to-End Distributed Tracing:** Span hierarchies decomposing login and session establishment across posture checks, policy decisions, TLS handshakes, and entitlement resolution, turning hours of triage into seconds.

• **On-Demand Packet Capture:** Admin-triggered PCAP coordinated across client and gateway, with the workflow fully observable through OTel logs and traces.

• **AI-Ready Foundation:** Structured, semantically rich telemetry that future LLM-based incident analysis agents can reason over. The schema you design today is the substrate for Phase 3.

In this role, you will:

• **Architect the Observability Platform:** Define telemetry schema, correlation model, transport, and sampling strategies spanning client devices, controllers, and gateways.

• **Build the Telemetry SDKs and LogForwarder:** Instrument AppGate components with OpenTelemetry and implement the enrichment, redaction, batching, and tail-sampling pipeline that scales horizontally under load.

• **Validate at Customer Scale:** Test in lab environments matching our largest deployments (hundreds of sites, tens of thousands of concurrent sessions) and hunt down cardinality explosions and pipeline backpressure before customers see them.

• **Drive Integration Standards:** Own the OTLP, Prometheus, and JSON-log compatibility surface and validate ingestion into Datadog, Splunk, Nexthink, and Elastic.

• **Raise the Engineering Bar:** Establish patterns and review practices the Data + AI team builds on. Mentor engineers and grow the observability discipline inside AppGate.

• **Collaborate Cross-Functionally:** Work directly with product, R&D, and marquee customers in defense and critical infrastructure to shape requirements and deliver outcomes that matter.

**Required Qualifications**

• **8+ years of engineering experience,** with at least 4 years dedicated to observability, telemetry, or large-scale data infrastructure (Datadog, Splunk, Elastic, Honeycomb, New Relic, Grafana Labs, or equivalent).

• **Deep OpenTelemetry expertise:** OTLP, the OTel Collector, semantic conventions, context propagation, and head/tail sampling. You can debate the trade-offs in your sleep.

• **Distributed tracing in production:** You’ve designed or significantly contributed to a tracing system handling real customer traffic, not just a side project.

• **High-throughput pipeline experience:** Hands-on with systems ingesting 100M+ events per day, including backpressure handling, batching, and storage trade-offs.

• **Strong systems programming:** Production Go and/or Rust preferred. Comfort across the stack, from agent code to backend services.

• **Networking and security fluency:** Comfortable with TLS, DNS, TCP, and identity protocols. Prior ZTNA, SASE, or SD-WAN experience is a strong plus.

• **Mindset:** Pragmatic, opinionated, and impact-driven. You know when to prototype and when to ship.

**Our Observability Vision**

AppGate secures defense agencies, federal governments, and Fortune 100 enterprises. When a connection traverses our ZTNA fabric — across clients, gateways, controllers, and protected resources — every hop carries real consequences for national security and business continuity. Yet when something breaks, the answer to _“Why can’t I reach this resource?”_ is still buried in fragmented logs and tribal knowledge. That ends now.

We are building **Observability AI** — a purpose-built observability platform for the Zero Trust era. It emits high-fidelity, correlated telemetry across every AppGate component, is OpenTelemetry-native, engineered for 100M+ events per day, and designed to stream into Datadog, Splunk, Nexthink, Elastic, or any OTLP-compatible backend. The roadmap runs from a raw data-feed MVP, through native analytics and root-cause dashboards, to **AI-driven incident analysis** — LLM agents that read traces and explain failures in AppGate terms — and ultimately to autonomous remediation. This is the nervous system for networks that protect nations.

This is your chance to build the **observability platform for networks that protect nations.**

If you’ve shipped observability at scale and want to apply that craft where the stakes are highest, **we want to hear from you.**

_AppGate is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or veteran status, age or any other federally protected class. In furtherance of AppGate's policy regarding affirmative action and equal employment opportunity, AppGate has developed a written affirmative action program. This program is available for review upon request by any applicant or employee during normal business hours by contacting the company's EEO Coordinator._
