# Senior AI Engineer

**Company:** [Lucida AI](http://jobs.workable.com/companies/v638cv3C4ZZ1pNkSQNkv4E.md)
**Location:** Maslak, Turkey
**Workplace:** On-site
**Employment type:** Full-time
**Department:** Technology

[Apply for this job](http://jobs.workable.com/view/ec780251-364b-4eb3-a0d7-2cdcc712afa4)

## Description

### Lucida is teaching the world to speak.

Two billion people are trying to learn a language. Almost all of them are stuck, not because they lack motivation, but because the only thing that actually works (talking to a human tutor) is too expensive, too inconvenient, or too embarrassing.

We're building the alternative: a voice-first AI tutor you can actually have a conversation with, anytime, in your pocket. Real-time. Sub-second. Feels-like-a-person. Already serving a million learners.

We're well-funded, seed-stage, and we're hiring the engineer who'll build the backbone behind that product.

### The role

You'll own a meaningful surface of our backend: the systems that turn audio, models, prompts, and user state into a working tutor at scale. Day-to-day, you'll:

-   Design and operate the **real-time conversational pipeline**: streaming services and WebSocket interfaces that keep latency budgets honest at the scale of a million users
-   Build and harden the **LLM orchestration layer**: prompt design as code, structured outputs, streaming, retries, fallbacks, cost control across multiple providers
-   Treat **prompts as engineering artifacts**: versioned, evaluated, regression-tested. Vibes are not a methodology.
-   Take **open-source models** (LLM, ASR, TTS, avatar) from a paper or HF repo and put them on our **GPUs**: benchmark, optimize, serve, monitor
-   **Fine-tune and train** our own models on top of open-source bases: curate datasets, run training jobs, evaluate against production criteria, and ship the result
-   Design **event-driven media flows**: webhooks, post-session processing, recording and export pipelines
-   Own third-party integrations end-to-end: contracts, retries, observability, the boring-important stuff
-   Make architecture decisions _with_ the founders, not after them
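To give a flavor of the orchestration work above, here's a minimal sketch of provider fallback with retries and backoff. Everything here is illustrative, not our actual stack: the provider names, the `call_provider` stub, and the backoff constants are all hypothetical.

```python
import asyncio

# Hypothetical provider ordering; in practice this would wrap real LLM clients.
PROVIDERS = ["primary", "fallback"]

async def call_provider(name: str, prompt: str) -> str:
    # Stand-in for a real streaming LLM call. The "primary" provider is
    # hard-coded to fail here so the fallback path is exercised.
    if name == "primary":
        raise ConnectionError(f"{name} unavailable")
    await asyncio.sleep(0.01)
    return f"[{name}] answer to: {prompt}"

async def complete(prompt: str, retries: int = 2) -> str:
    """Try each provider in order, retrying each with exponential backoff."""
    last_exc: Exception | None = None
    for provider in PROVIDERS:
        for attempt in range(retries):
            try:
                return await call_provider(provider, prompt)
            except ConnectionError as exc:
                last_exc = exc
                await asyncio.sleep(0.01 * 2 ** attempt)  # back off, then retry
    raise RuntimeError("all providers exhausted") from last_exc

if __name__ == "__main__":
    print(asyncio.run(complete("How do I say 'hello' in Turkish?")))
```

The real layer adds streaming, structured-output validation, and per-provider cost accounting on top of this skeleton.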

### What we're looking for

-   **5+ years** writing production Python you're not embarrassed by: typed, tested, readable
-   Deep fluency in **asyncio** and concurrent/streaming code
-   Strong command of HTTP, WebSockets, and event-driven systems
-   Hands-on experience integrating with **LLM APIs in production**: streaming, tool use, structured outputs, and the operational realities (rate limits, retries, cost control)
-   A real sense of **prompt engineering as engineering**: you've shipped prompts that survived contact with users, iterated on them with data, and didn't just "feel good in the playground"
-   A real **fine-tuning / training track record**: you've taken an open-source model, prepared the data, run the training, evaluated it honestly, and shipped the result to users. Not a notebook tutorial. A model that moved a metric.
-   Experience deploying and serving **your own models on GPUs**: quantization, batching, KV-cache, latency/throughput tradeoffs
-   A debugging instinct for distributed systems at scale: traces, profiling, backpressure, capacity planning
-   Comfort with Postgres, Redis, and a queue/broker layer
-   Pragmatism: you ship, you measure, you iterate. You don't over-engineer, and you don't under-test.
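"Prompts as engineering artifacts" in practice looks something like the sketch below: a versioned prompt with an invariant check that runs before it ships. The class name, version scheme, and template are all hypothetical, a shape rather than our actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PromptVersion:
    """A prompt treated as a versioned, testable artifact (illustrative shape)."""
    name: str
    version: str
    template: str

    def render(self, **vars: str) -> str:
        return self.template.format(**vars)

# Hypothetical example prompt under version control.
TUTOR_GREETING = PromptVersion(
    name="tutor_greeting",
    version="2024-06-01.3",
    template="You are a friendly {language} tutor. Greet {learner} and ask one question.",
)

def regression_check(prompt: PromptVersion) -> None:
    """Tiny stand-in for an eval suite: rendered prompts must satisfy
    invariants before a new version ships."""
    rendered = prompt.render(language="Turkish", learner="Ayşe")
    assert "Turkish" in rendered, "language must appear in the rendered prompt"
    assert len(rendered) < 500, "keep the token budget honest"

regression_check(TUTOR_GREETING)
```

The point is the workflow, not the code: every prompt change gets a version bump and has to pass its checks, the same way any other code change would.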

### Nice to have

-   Real-time media systems (WebRTC, SFU, streaming pipelines)
-   Audio or speech model deployment and fine-tuning in production
-   Distillation, synthetic data generation, or RLHF/DPO-style alignment work
-   Multi-region or multi-cloud infrastructure
-   Cost optimization at scale, token economics, GPU utilization, caching strategies
-   Open-source contributions
