Reinforcement Learning Environments Engineers

Brief description of the vacancy

We’re hiring RL Environments Engineers to design and build MLE/SWE environments that deliver high-quality, diverse tasks with minimal supervision. You’ll target specific language models and meet a defined difficulty distribution, delivering about one task every 3–5 hours. This is a remote contractor role; at least 4 hours of overlap with PST business hours and advanced English (C1/C2) are required.

About the company

Company: XOR.AI/Preference Model

Preference Model: We Scale AI Agent Infrastructure. We focus on building rigorous environments and evaluations for language models. The company recently closed a large funding round and is running pilots with leading AI labs.

Responsibilities

Design and build MLE/SWE environments and diverse tasks.
Target a specified language model and satisfy the required difficulty distribution.
Deliver ~1 task per 3–5 hours once onboarded.
Respond to customer feedback and revise tasks within 24 hours.
Onboard quickly and start delivering on day one; operate with minimal supervision.

Requirements

Strong Python skills suitable for building RL environments and tasks.

Experience designing environments/tasks for RL or evaluations.

Ability to meet the throughput expectation (one task per 3–5 hours) and respond to feedback within 24 hours.

Comfortable onboarding quickly and working independently.

Ability to work with at least 4 hours of overlap with PST business hours.

Advanced English (C1/C2) for specifications, reviews, and feedback.

Working conditions

Remote, independent contractor engagement.

Deliverables-driven; begin shipping on day one.

Contacts

zfec7b0038be4
