# Scientific AI Evaluation & Computational Problem Designer

**Company:** [Weekday AI](http://jobs.workable.com/companies/pxG9rDgnvZm2c86JUchT1j.md)
**Location:** Remote
**Workplace:** remote
**Employment type:** Part-time
**Department:** AI Training

[Apply for this job](http://jobs.workable.com/view/e1302ebc-0557-4bea-aca4-1c73bae11b89)

## Description

**This role is for one of our clients**

**Compensation: $45-$100 per hour**

We are building a large-scale evaluation benchmark to test advanced AI reasoning across scientific and engineering domains. This role focuses on designing rigorous, research-grade computational problems that assess how effectively AI systems can leverage real scientific software tools to solve complex challenges.

Unlike traditional annotation roles, this position requires creating original, graduate-level problems rooted in real-world scientific workflows. You will iteratively refine these problems through calibration against state-of-the-art AI models, ensuring the right balance of difficulty, depth, and reasoning complexity.

## Requirements

**What You’ll Do**

-   Design advanced computational problems requiring the use of domain-specific scientific software
-   Create tasks that test both precise execution (multi-step workflows, simulations) and strategic reasoning (experiment design, inference from partial data)
-   Develop problem setups, solution pathways, and validation mechanisms
-   Calibrate and refine tasks based on model performance to achieve target difficulty levels
-   Ensure problems emphasize reasoning strategy over brute-force computation
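
As a rough illustration of what a "validation mechanism" for such a problem could look like, here is a minimal sketch in Python. It assumes a task whose final answer is a small set of named numeric quantities; the function name `validate_answer`, the key names, and the tolerance values are all hypothetical and are not taken from the client's actual tooling.

```python
import math


def validate_answer(submitted, reference, rel_tol=1e-3, abs_tol=1e-9):
    """Check a submitted answer against a reference solution.

    Both arguments are dicts mapping quantity names to numbers.
    Returns True only if the keys match exactly and every value is a
    finite number within the given tolerance of the reference.
    (Hypothetical helper; tolerances are illustrative defaults.)
    """
    # The answer must report exactly the requested quantities.
    if set(submitted) != set(reference):
        return False

    for key, ref_val in reference.items():
        val = submitted[key]
        # Reject non-numeric values, booleans, NaN, and infinities.
        if isinstance(val, bool) or not isinstance(val, (int, float)):
            return False
        if not math.isfinite(val):
            return False
        # Tolerance-based comparison instead of exact equality.
        if not math.isclose(val, ref_val, rel_tol=rel_tol, abs_tol=abs_tol):
            return False
    return True


if __name__ == "__main__":
    # Made-up reference values for demonstration only.
    reference = {"peak_frequency_hz": 1000.0, "damping_ratio": 0.05}
    candidate = {"peak_frequency_hz": 1000.4, "damping_ratio": 0.05}
    print(validate_answer(candidate, reference))
```

Comparing within a tolerance rather than for exact equality matters here because results from scientific software typically vary slightly with solver settings, library versions, and floating-point behavior.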

**Domains & Tools of Interest**  
We are particularly seeking candidates with hands-on experience in:

-   **Bioinformatics & Single-Cell Genomics:** scanpy, scvelo, squidpy, gudhi (RNA-seq, trajectory inference, spatial transcriptomics)
-   **Computational Chemistry:** PySCF (HF, DFT, TDDFT, CASSCF, post-HF methods)
-   **Particle & Nuclear Physics:** scikit-hep, Monte Carlo simulations, collider data analysis
-   **Electrical Engineering:** scikit-rf, ngspice (RF systems, circuit simulation)
-   **Astrophysics & Cosmology:** astropy (cosmological modeling, survey analysis)
-   **Structural & Mechanical Engineering:** scikit-fem (finite element analysis, elasticity, beam theory)
-   **Seismology & Geophysics:** ObsPy, SPECFEM (waveform analysis, inversion, tomography)
-   **Pharmacokinetics & Systems Biology:** libRoadRunner, Tellurium, SBML-based tools

Experience with other specialized tools in related domains is also welcome.

**What Makes You a Strong Fit**

-   Graduate-level expertise (MS or PhD preferred) in a relevant STEM field
-   Hands-on experience using scientific software libraries for real research problems
-   Strong Python programming skills, including building computational workflows and validators
-   Ability to design challenging problems that require deep reasoning rather than surface-level solutions
-   Familiarity with edge cases, limitations, and practical challenges of scientific tools

**Requirements**

-   Demonstrated proficiency with at least one relevant scientific library (via research, open-source work, or industry experience)
-   Ability to work independently and iterate based on feedback
-   Comfort working in Linux/terminal environments and remote compute setups
-   Availability of at least 15–20 hours per week

**Nice to Have**

-   Experience across multiple domains or tools
-   Background in evaluation frameworks or benchmarking
-   Experience in teaching, pedagogy, or problem-set design
-   Familiarity with reproducible research practices and containerized environments

**Engagement Details**

-   Independent contractor role
-   Fully remote with flexible scheduling
-   Project scope may evolve based on performance and research needs

**Compensation & Payments**

-   Competitive compensation based on expertise and domain specialization
-   Weekly payments via supported global payment platforms

**Additional Information**

-   Work must not involve sharing confidential or proprietary information from any current or past employer or institution
-   Projects may be extended, modified, or concluded based on performance and business requirements
-   This opportunity does not currently support certain work authorization categories
