OpenAI Pays $25k to Break Its Own Bio Guardrails

Published: April 25, 2026 at 12:29 AM

Updated: April 25, 2026 at 12:29 AM

100-word summary

OpenAI is paying hackers up to $25,000 to jailbreak GPT-5.5's biology safeguards. The challenge: craft a single prompt that defeats all five of the model's bio safety questions without triggering moderation filters. It's open only to vetted red-teamers who sign NDAs, with testing running through July 2026. The bounty targets universal jailbreaks, the kind that work every time rather than occasionally slipping through. Smaller prizes go to partial successes. The twist? OpenAI is essentially crowdsourcing the exact attack vectors that could make its most advanced model dangerous, betting that finding them first is safer than waiting for someone else to discover them in the wild.

What happened

Why it matters

The bounty targets universal jailbreaks, the kind that work every time rather than occasionally slipping through. Smaller prizes go to partial successes. The twist? OpenAI is essentially crowdsourcing the exact attack vectors that could make its most advanced model dangerous, betting that finding them first is safer than waiting for someone else to discover them in the wild.

Sources

OpenAI