RL Environments, Kernel Optimization, GPU/CUDA, Compilers (LLVM/MLIR), PyTorch Extensions, Distributed Inference (vLLM/NCCL)
Brief Description of the Role
We're hiring Low-Level Engineers to design and build RL environments that teach LLMs kernel development, hardware optimization, and systems programming. The goal is to create realistic feedback loops where models learn to write high-performance code across GPU and CPU architectures.
This is a remote contractor role. At least 4 hours of overlap with PST and advanced English (C1/C2) are required.
Company: Preference Model via XOR.AI
About the Company
Preference Model is building the next generation of training data to power the future of AI. Today's models are powerful but fail to reach their potential across diverse use cases because so many of the tasks we want to use them for are out of distribution. Preference Model creates RL environments where models encounter research and engineering problems, iterate, and learn from realistic feedback loops.
Our founding team has previous experience on Anthropic's data team building data infrastructure, tokenizers, and datasets behind the Claude model. We are partnering with leading AI labs to push AI closer to achieving its transformative potential.
The company is backed by Tier 1 Silicon Valley venture capital.
Minimum Qualifications
Remote contractor role, flexible schedule.
Hourly contractor rate: 90-125 USD/hour (depending on expertise level and the quality of the take-home assignment).