RL Environments; MLE; SWE; LLM Tasks; Difficulty Distribution; Remote Contractor; PST Overlap (≥4h); Advanced English (C1/C2); Fast Edits (24h); 1 Task / 3–5h; Targets Specific Model; Homework Assignment
We’re hiring RL Environments Engineers to design and build MLE/SWE environments that deliver high-quality, diverse tasks with minimal supervision. You’ll target specific language models and meet a defined difficulty distribution, delivering about one task every 3–5 hours. This is a remote contractor role; at least 4 hours of overlap with PST business hours and advanced English (C1/C2) are required.
Company: XOR.AI/Preference Model
Preference Model: We Scale AI Agent Infrastructure. We focus on building rigorous environments and evaluations for language models. The company recently closed a large funding round and is running pilots with leading AI labs.
Strong Python skills suitable for building RL environments and tasks.
Experience designing environments/tasks for RL or evaluations.
Ability to meet the throughput expectation (1 task / 3–5h) and respond to feedback within 24 hours.
Comfortable onboarding quickly and working independently.
Ability to work with at least 4 hours of overlap with PST business hours.
Advanced English (C1/C2) for specifications, reviews, and feedback.
Remote, independent contractor engagement.
Deliverables-driven; begin shipping on day one.