Similar Items: Principles and Guidelines for Randomized Controlled Trials in AI Evaluation
- When Should Teachers Control AI Generation for Mathematics Visuals?
- Towards Apples to Apples for AI Evaluations: From Real-World Use Cases to Evaluation Scenarios
- "It depends on where AI is used": Players' attitude patterns and evaluative logics toward different AI applications in digital games
- Useful for Exploration, Risky for Precision: Evaluating AI Tools in Academic Research
- UX in the Age of AI: Rethinking Evaluation Metrics Through a Statistical Lens
- The Missing Evaluation Axis: What 10,000 Student Submissions Reveal About AI Tutor Effectiveness