Requirements
• Master’s degree in Computer Science, AI, ML, or a related technical field,
• (Desirable) Deep knowledge of transformer internals or LLM training/inference,
• Strong Python skills with production-quality engineering standards,
• (Desirable) Experience with inference libraries such as vLLM or SGLang,
• Experience designing or working with RL environments or training pipelines,
• (Desirable) CUDA or custom kernel optimization experience (e.g. Pallas),
• Solid understanding of modern LLMs and their limitations,
• (Desirable) Research experience with publications or high-quality open-source work,
• Ability to work quickly, iterate reliably, and respond to feedback,
• (Desirable) Experience building complex or open-ended RL-based learning systems,
• Advanced English proficiency (C1/C2)
What the job involves
• Design and build reinforcement learning environments for training and evaluating LLMs,
• Translate modern ML and AI research into structured RL problems,
• Implement reliable, debuggable, and scalable training environments in Python,
• Collaborate with researchers and engineers to improve model learning quality,
• Complete an average of two well-scoped tasks per week,
• Iterate quickly based on feedback and evaluation results