A concise summary of the capability concern (e.g., "Model exhibits deceptive behavior in negotiation tasks")

Include prompt text, system prompts, temperature settings, and any scaffolding used. The more detail, the faster we can verify.

Paste links to any supporting evidence (GitHub repos, screenshots, conversation logs, etc.)

Provide contact details if you'd like to receive updates on your report. We never share your contact information.

Reports are typically triaged within 4 hours.
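
The reproduction details requested above (prompt text, system prompt, temperature, scaffolding, evidence links) can be easier to verify when they are collected in one structured record before being pasted into the form. The following is only a minimal sketch of that idea: the class name, field names, and JSON layout are hypothetical illustrations, not a schema this form requires or parses.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ReproductionDetails:
    """Hypothetical record of what a reviewer needs to replicate a report.

    Field names are illustrative assumptions, not a required format.
    """
    prompt: str                          # exact prompt text sent to the model
    system_prompt: str = ""              # system prompt, if one was used
    temperature: Optional[float] = None  # None when the setting is unknown
    scaffolding: str = ""                # tools, agents, or wrappers around the model
    evidence_links: List[str] = field(default_factory=list)


def to_report_text(details: ReproductionDetails) -> str:
    """Serialize the record as indented JSON suitable for pasting into a text field."""
    return json.dumps(asdict(details), indent=2, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder values only; replace them with the details of the actual run.
    example = ReproductionDetails(
        prompt="<exact prompt text>",
        system_prompt="<system prompt, if any>",
        temperature=0.7,
        scaffolding="<description of any scaffolding>",
        evidence_links=["<link to conversation log>"],
    )
    print(to_report_text(example))
```

Leaving `temperature` as `None` when the value is unknown, instead of guessing, keeps the record honest about what was actually observed.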

What Happens After You Submit

1. Automated Corroboration

Our AI-assisted research system will attempt to find supporting evidence and related reports within 4 hours.

2. Expert Review

A verified reviewer will assess the report, attempt replication, and determine severity classification.

3. Public Disclosure

Verified incidents are added to our public Threshold Tracker at an appropriate level of detail.

4. Ongoing Monitoring

We track whether the issue persists across model versions and notify relevant stakeholders.