- Even the most advanced AI models, like GPT-4-Turbo, failed to demonstrate expert-level understanding of global history, scoring only 46% on a rigorous benchmark test designed for graduate-level inquiry.
- AI models performed better on ancient history than on more recent events, and consistently struggled with regions outside the Western world, especially Sub-Saharan Africa and Oceania, which points to biases in their training data.
- The study underscores a major limitation of current AI models: while they excel at surface-level fact recall, they lack the deep contextual reasoning and global coverage needed for sophisticated historical analysis.