ML Evaluation Designer – Expert (Remote Contract)
Mercor · Uruguay
Job description
About the role
Mercor is seeking an experienced Machine Learning Engineer to design and refine evaluation tasks for large language models. This remote contract position focuses on creating robust rubrics, metrics, and grading processes that directly improve model performance.
Key responsibilities
- Design ML/LLM evaluation tasks, rubrics, and metrics.
- Grade model and agent outputs, ensuring high‑quality evaluation data.
- Apply training‑side judgment from SFT, RLHF, and reward modeling work to shape evaluation design.
- Collaborate with AI research teams to refine evaluation signals and enhance model outputs.
- Work independently and asynchronously to meet deadlines and drive AI model improvements.
Required profile
- 5+ years of experience as a Machine Learning Engineer with hands‑on training and evaluation work.
- Strong written communication skills.
- Experience with SFT, RLHF, reward modeling, and evaluation metrics is preferred.
Required skills
- PyTorch
- JAX
- Hugging Face
- SFT
- RLHF
- Reward modeling
What we offer
- Competitive hourly rate ranging from $45 to $140.
- Fully remote work environment.
- Flexible schedule with a commitment of 30+ hours per week.
- Opportunity to influence cutting‑edge AI research and model performance.
Posted 5 hours ago
Expires in 1 month