Reinforcement Learning Environments Engineers

$18,000–32,000/month
Remote
Full-time

RL Environments; MLE; SWE; LLM Tasks; Difficulty Distribution; Remote Contractor; PST Overlap (≥4h); Advanced English (C1/C2); Fast Edits (24h); 1 Task / 3–5h; Targets Specific Model; Homework Assignment

Brief description of the vacancy

We’re hiring RL Environments Engineers to design and build MLE/SWE environments that deliver high-quality, diverse tasks with minimal supervision. You’ll target specific language models and meet a defined difficulty distribution, delivering about one task every 3–5 hours. This is a remote contractor role; at least 4 hours of overlap with PST business hours and advanced English (C1/C2) are required.

About the company

Company XOR.AI/Preference Model

Preference Model: We Scale AI Agent Infrastructure. We focus on building rigorous environments and evaluations for language models. The company recently closed a large funding round and is running pilots with leading AI labs.

Responsibilities

  • Design and build MLE/SWE environments and diverse tasks.
  • Target a specified language model and satisfy the required difficulty distribution.
  • Deliver ~1 task per 3–5 hours once onboarded.
  • Be responsive and edit tasks within 24 hours based on customer feedback.
  • Onboard quickly and start delivering on day one; operate with minimal supervision.

Requirements

Strong Python skills suitable for building RL environments and tasks.

Experience designing environments/tasks for RL or evaluations.

Ability to meet the throughput expectation (1 task / 3–5h) and respond to feedback within 24 hours.

Comfortable onboarding quickly and working independently.

Ability to work with at least 4 hours of overlap with PST business hours.

Advanced English (C1/C2) for specifications, reviews, and feedback.

Working conditions

Remote, independent contractor engagement.

Deliverables-driven; begin shipping on day one.
