#26 · Researcher · AI Safety & Alignment

John Schulman

Anthropic (formerly OpenAI)

Specialties

Reinforcement Learning, AI Safety

Location

USA

Education

PhD in Computer Science, UC Berkeley (2016).

Biography

Co-founder of OpenAI and lead author of the Proximal Policy Optimization (PPO) algorithm. Joined Anthropic in 2024. A pioneer of deep reinforcement learning.

Key Influence

Lead author of PPO, the reinforcement learning algorithm used in the RLHF training of ChatGPT. Co-founded OpenAI.