UNDERNEWS: Top AI Models Flunk Graduate-Level History Exam

June 2, 2025

Study Finds

Even the most advanced AI models, like GPT-4-Turbo, failed to demonstrate expert-level understanding of global history, scoring only 46% on a rigorous benchmark test designed for graduate-level inquiry.
AI models performed better on ancient history than on more recent events, and consistently struggled with regions outside the Western world, especially Sub-Saharan Africa and Oceania, highlighting biases in training data.
The study underscores a major limitation of current AI models: while they excel at surface-level fact recall, they lack the deep contextual reasoning and global coverage needed for sophisticated historical analysis.

ABOUT THE EDITOR

The Review is edited by Sam Smith, who covered Washington under nine presidents, has edited the Progressive Review and its predecessors since 1964, has written four books, has been published in five anthologies, helped to start five organizations (including the DC Humanities Council and the DC Statehood Party), was a plaintiff in three successful class action suits, served as a Coast Guard officer, and played in jazz bands for four decades.

A truly independent journalist with his feet firmly grounded in the reality of neighborhoods and everyday people. -- Patrick Mazza, Progressive Populist

A truly original voice in American journalism: humorous and plain spoken and filled with common sense -- Jay Walljasper, Utne Reader

Inimitable -- Mother Jones Magazine

Sam's a cynical cat -- Marion Barry

Sam's one of the few independent voices left. The press today is either extreme or special interest or else just establishment, an extension of the corporate spirit -- Sen. Eugene McCarthy

One of a small group of whites with whom many blacks would trust their political lives -- Chuck Stone, Washingtonian

A reputation for wit, intelligence and anger. -- Claude Lewis, Chicago Tribune

Smith is an island of reason and information in a sea of narcissistic blather. -- City Paper, Washington

Whatever the debate, the Review's sharp critiques encourage us to look out our window, notice and act upon what we see, and also to look further -- to the rest of the country and globe -- to see how the organized big world interacts with our more spontaneous small worlds. -- Utne Reader