

Presented by
BlueDot Impact
We’re building the workforce needed to safely navigate AGI.
Contact: [email protected]
AI Safety Evals - Paper Reading Club
Registration
About Event
Continuing our theme of research with implications for recursive self-improvement, this week Justin Dollman will present HCAST: Human-Calibrated Autonomy Software Tasks, a benchmark of real-world engineering and reasoning tasks, along with the results of testing frontier LLMs on it.
Each week, someone presents for 20-30 minutes, followed by discussion. RSVP to join, sign up to present, or write to [email protected] if you have questions. Everyone is welcome!