# AI Infrastructure Engineer (Storage)

**Company:** [CommonAI C.I.C.](http://jobs.workable.com/companies/xeWV7fdrawT4BSozu5vzff.md)
**Location:** Cambridge, United Kingdom
**Workplace:** on site
**Employment type:** Full-time

[Apply for this job](http://jobs.workable.com/view/eb5d5809-3b96-4c7a-8453-2aa8d4e88810)

## Description

CommonAI CIC is a non-profit membership organisation, founded on a belief in collaborative engineering for the safe and responsible development of foundational AI technologies. It is a place where AI startups, enterprises large and small, public sector bodies and academia can share resources and knowledge to co-develop and grow businesses, fast.

We support technology-focused startups, each with unique data management challenges, and are seeking an experienced Infrastructure Engineer to help them design, deploy and maintain high-performance storage systems for their AI and data-driven workloads. The successful candidate will combine deep experience architecting and managing distributed, cloud, and tiered storage solutions with strong Linux and automation skills.

In this role you will:

-   Design, implement, and maintain storage platforms that support large-scale AI and data pipelines.
-   Manage distributed storage systems such as Ceph, Lustre, or BeeGFS.
-   Oversee tiered storage architectures, optimising data movement across high-performance, object, and archival tiers.
-   Ensure data integrity, availability, and security across on-premises and cloud environments.
-   Develop automation and monitoring tools using Bash, Python, or similar scripting languages.
-   Manage and secure container images and related storage used for AI and ML workloads.
-   Integrate storage systems with public cloud services (AWS, Azure, GCP) and hybrid environments.
-   Troubleshoot complex storage and data flow issues, collaborating closely with AI platform and infrastructure teams.
-   Contribute to ongoing architecture improvements, performance tuning, and capacity planning.

## Requirements

To be considered, candidates should meet most of the following requirements:

-   Strong Linux system administration background.
-   Proven experience installing, configuring, and maintaining Ceph clusters or similar technologies in a production environment.
-   Familiarity with distributed filesystems (e.g. Lustre, BeeGFS) and cloud-based storage services (e.g. Amazon S3).
-   Experience with tiered storage management and data lifecycle policies.
-   Scripting and automation proficiency (e.g. Bash, Python, Terraform/OpenTofu, Ansible).
-   Understanding of data security best practices and compliance considerations.
-   Experience working with container technologies (e.g. Docker, Kubernetes) and image storage registries.
-   Strong analytical, troubleshooting, communication and documentation skills.

We also value:

-   Knowledge of GPU compute environments or AI training infrastructure.
-   Experience with monitoring and observability tools (Prometheus, Grafana, etc.).
-   Contributions to open-source storage, data management, or infrastructure projects.
-   Familiarity with object storage systems (S3, RADOS Gateway, MinIO, etc.).

## Benefits

-   A collaborative and supportive work environment.
-   The opportunity to have a high impact in a growing organisation.
-   Competitive salary package and pension.
-   Professional development opportunities.
-   Networking opportunities with influential people from across the tech sector and academia.
-   A vibrant office environment located a few minutes' walk from Cambridge train station.

**CommonAI CIC is an equal opportunity employer and is committed to creating an inclusive and diverse workplace.**
