AI Safety Engineer

Anthropic

San Francisco, CA, US, 94105

Job details

Type: Full-time
Salary: $240,000 – $340,000
Industry: Artificial Intelligence
Posted: 3 hours ago
Closes: September 30, 2026
Experience: 5+ years
Education: Postgraduate Degree

Job description

Anthropic is hiring an AI Safety Engineer to ensure Claude and future AI systems are safe, reliable, and aligned with human values. You'll work at the cutting edge of AI alignment research and engineering.

Implement and evaluate safety techniques including Constitutional AI and RLHF
Build evaluation frameworks to measure model safety and alignment properties
Contribute to safety research publications and technical reports
Collaborate with policy teams to translate technical safety work into guidelines

Required skills

Python
PyTorch
RLHF
LLMs
Safety Evaluation
Research

Benefits

Equity
Health
401(k)
Meals
Relocation
