Research and Insights

“Playing Pretend: Expert Personas Don’t Improve Factual Accuracy”

This study investigates whether persona prompting improves AI performance on challenging academic benchmarks. We find that despite widespread adoption, assigning expert personas (e.g., “You are a world-class physics expert”) does not reliably improve accuracy. Domain-mismatched experts sometimes degrade performance, and low-knowledge personas (layperson, young child, toddler) often reduce accuracy. These results suggest practitioners should focus on task-specific instructions rather than persona assignment.

Three overlapping silhouettes symbolize diverse thinking and collaboration. Surrounding icons represent science, technology, and innovation.

“I’ll pay you or I’ll kill you — but will you care?”

This study rigorously tests whether threatening or tipping AI models improves their performance. We find that despite popular claims, these prompting strategies do not significantly enhance results on challenging benchmarks. While individual questions may see dramatic performance swings based on prompt variations, there’s no reliable way to predict which questions will benefit from which prompts.

Illustration of chess pieces, coins, gears, and warning signs, suggesting themes of strategy, risk, and financial planning.

“Call Me A Jerk: Persuading AI to Comply with Objectionable Requests”

Our research suggests large language models exhibit parahuman responses to persuasion techniques, despite not being human. We found that classic persuasion principles like authority, commitment, and unity can dramatically increase an AI’s likelihood to comply with requests they are designed to refuse. For GPT-4o-mini, these techniques more than doubled compliance rates (72.0% vs. 33.3% in controls). This emergence of “parahuman” tendencies suggests that social scientists have a valuable role to play in understanding AI behavior.

Mid-century-modern–style landscape illustration on an off-white background. At center is a split side-profile head: the left half a solid black human silhouette, the right half teal with gold circuit lines, symbolizing AI with parahuman traits. Surrounding the head are flat, colorful icons—speech bubbles, magnifying glass, network node cluster, bar chart, upward arrow, light-bulb, and cloud—in mustard yellow, burnt orange, teal, and black—conveying social-science analysis of AI.

“The Decreasing Value of Chain of Thought in Prompting”

This report investigates Chain-of-Thought (CoT) prompting, which encourages LLMs to ‘think step by step.’ We tested this common prompting approach and found that its effectiveness varies significantly by model type and task: non-reasoning models show modest average improvements but increased variability in answers, while reasoning models gain only marginal benefits despite substantial time costs (20-80% increase). These findings challenge the assumption that CoT is universally beneficial.

Illustration of a brain surrounded by various icons such as gears, puzzle pieces, a magnifying glass, and a light bulb, connected by dotted lines, representing creative thinking or problem-solving.

“Prompt Engineering is Complicated and Contingent”

This report is fundamentally about exploring the variability in language model performance, not the models themselves. As experimenters, we demonstrate how the same model can produce dramatically different results based on small changes in prompting and evaluation methods – a critical consideration for real-world applications.

Abstract illustration featuring various charts and graphs, including bar charts, line graphs, pie charts, and geometric shapes in red, yellow, and blue, representing data visualization and analysis.

“AI Agents and Education: Simulated Practice at Scale”

Generative AI has the potential to significantly lower the barriers to creating effective, engaging simulations. This opens up new possibilities for experiential learning at scale. By leveraging a system of multiple AI agents, simulations can provide personalized learning experiences, offering students the opportunity to practice skills in scenarios with AI-generated mentors, role-players, and instructor-facing evaluators. In this paper we describe a prototype, PitchQuest, a venture capital pitching simulator that showcases the capabilities of AI in delivering instruction, facilitating practice, and providing tailored feedback.  We discuss the pedagogy behind the simulation, the technology powering it, and the ethical considerations in using AI for education, while acknowledging the limitations and need for rigorous testing.

A flowchart illustrating a process involving different agents: Mentor Agent, Investor Agent, Evaluator Agent, Progress Agent, and Class Insights Agent. Each agent plays a role in analyzing and summarizing learner activities and performance, resulting in individual and class-level reports

“Instructors as Innovators: A Future-Focused Approach to New AI Learning Opportunities, with Prompts”

This paper explores how instructors can leverage generative AI to create personalized learning experiences for students that transform teaching and learning. We present a range of AI-based exercises that enable novel forms of practice and application including simulations, mentoring, coaching, and co-creation. For each type of exercise, we provide prompts that instructors can customize, along with guidance on classroom implementation, assessment, and risks to consider. We also provide blueprints, prompts that help instructors create their own original prompts. Instructors can leverage their content and pedagogical expertise to design these experiences, putting them in the role of builders and innovators. We argue that this instructor-driven approach has the potential to democratize the development of educational technology by enabling individual instructors to create AI exercises and tools tailored to their students’ needs. While the exercises in this paper are a starting point, not a definitive solutions, they demonstrate AI’s potential to expand what is possible in teaching and learning.

Text-based image titled "Negotiations Simulator Prompt" detailing a role-playing scenario guide for negotiations, with steps like gathering information and setting up role-play.

“Assigning AI: Seven Approaches for Students, with Prompts”

The incredible promise of AI as a way for students all over the world, of all ability levels, to learn is undeniable. Education is our most powerful system for increasing social mobility, unlocking potential, and improving lives. A tool that can help with this has tremendous implications. Plus, students are already using AI for direct help. Teaching them how to do it responsibly may alleviate some of the negative implications of our AI moment. In this paper, we tackle ways that students can be assigned to use AI directly. We don’t shy away from the dangers but provide detailed instructions on how students and instructors can think about each of the tools we suggest.

Chart illustrating an AI Tutor prompt structure with colored sections for role, instructions, pedagogy, constraints, and personalization.

“Using AI to Implement Effective Teaching Strategies in Classroom: 5 Strategies, including Prompts”

In the rush to deliver AI benefits directly to students, the role of instructors is often overlooked. AI tutors, as exciting as they are, do not replace the complex role of a teacher in front of a class. This paper provides evidence-informed AI approaches to pedagogy to make teaching easier and more effective.

A hand typing on a laptop keyboard with digital data graphics overlay, including charts, binary code, and icons, representing technology and data analysis.

Questions? Contact Us

Logo featuring the Wharton School of the University of Pennsylvania with text "Generative AI Labs" and a shield emblem.