The same prompt. The same model. Two contexts: one where it knows it's being evaluated, one where it believes evaluation has ended. See what changes.
No pre-loaded examples yet. Use the evaluation form below to generate results.
Enter any prompt. Select a model. See both responses side by side. No signup required.
Paste a conversation transcript. Identify the points where the model's behavior shifts.
Receive a detailed analysis of behavioral divergence across frontier models, including methodology, raw data, and PSBS classifications.