⚔️ AI Prompt Duel
Two AI models. One prompt. You decide which response is better.
12 duels · 4,352 community votes
How it works
This experiment demonstrates blind evaluation of AI model outputs, the same methodology RLHF (Reinforcement Learning from Human Feedback) uses to align language models with human preferences. Model identities are hidden to reduce bias, and responses are shown side by side for a fair comparison.
The prompt/response pairs are pre-computed to illustrate how different models approach the same task with varying styles, depth, and accuracy. In production systems like Chatbot Arena, this approach generates preference data used to train reward models.
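To make the aggregation step concrete, here is a minimal sketch of how pairwise preference votes can be turned into model ratings with an Elo-style update, the scheme popularized by Chatbot Arena. This is an illustration under stated assumptions, not this app's implementation: the model names, the starting rating of 1000, and the K-factor of 32 are all assumed.

```typescript
// Elo-style rating updates from pairwise preference votes (sketch).
// All constants and names here are illustrative assumptions.

type Ratings = Record<string, number>;

const K = 32; // update step size (assumed)

// Probability that a player rated `ra` beats one rated `rb` under the Elo model.
function expectedScore(ra: number, rb: number): number {
  return 1 / (1 + Math.pow(10, (rb - ra) / 400));
}

// Apply one blind vote: `winner` was preferred over `loser`.
function applyVote(ratings: Ratings, winner: string, loser: string): void {
  const ra = ratings[winner] ?? 1000; // unseen models start at 1000 (assumed)
  const rb = ratings[loser] ?? 1000;
  const ea = expectedScore(ra, rb);
  ratings[winner] = ra + K * (1 - ea);
  ratings[loser] = rb - K * (1 - ea);
}

// Example: tally a handful of votes between two hypothetical models.
const ratings: Ratings = {};
applyVote(ratings, "model-a", "model-b");
applyVote(ratings, "model-a", "model-b");
applyVote(ratings, "model-b", "model-a");
console.log(ratings); // model-a ends up rated above model-b
```

Because each update adds to the winner exactly what it subtracts from the loser, total rating is conserved, which keeps the leaderboard zero-sum no matter how many votes arrive.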
Built with: Next.js · Chakra UI · Framer Motion · Pre-computed response pairs · Client-side voting with running tallies