AI Safety Engineer
Anthropic
San Francisco, CA 94105, US
Job description
Anthropic is hiring an AI Safety Engineer to ensure Claude and future AI systems are safe, reliable, and aligned with human values. You'll work at the cutting edge of AI alignment research and engineering.
- Implement and evaluate safety techniques including Constitutional AI and RLHF
- Build evaluation frameworks to measure model safety and alignment properties
- Contribute to safety research publications and technical reports
- Collaborate with policy teams to translate technical safety work into guidelines
Required skills
- Python
- PyTorch
- RLHF
- LLMs
- Safety Evaluation
- Research
Benefits
- Equity
- Health
- 401(k)
- Meals
- Relocation