Will AI Replace Historians?
As AI becomes more pervasive, anxiety about its impact on our daily lives—in particular, our jobs—increases. From the service industry to customer support to medicine, the question lingers: will AI take my job? Historians are not immune to this concern about AI’s impact on their livelihood, and in fact, according to a 2025 report issued by Microsoft Research, historians are among those most likely to be replaced by AI.
These sorts of reports understandably generate more anxiety and fuel hype-and-doom thinking. Is this the start of a utopia or a dystopia? And, more specifically in the realm of social studies, how can we convince our students that learning historical thinking skills matters if some AI agent can “do” history just as well as a trained historian?
Relax. The reports of our demise are greatly exaggerated. The anxiety—and in fact, much of the Microsoft report—is based on a shallow understanding of what historians do. If being a historian simply means retrieving available information and writing polished summaries, then sure, AIs could replace historians and many other knowledge workers. But the historian’s expertise is not in summary and retrieval. It is in the interpretation of evidence, the willingness to address and navigate uncertainty, the posing of challenging and meaningful questions, and the crafting of narratives to explain all of it in the context of our own time. These are the sorts of tasks and skills that are not automatable. Good news for history students and teachers everywhere.
Artificial thinking
Let’s pause for a moment to consider whether AIs are any good at the sorts of tasks that historians perform. At their core, LLMs are built on probability: their outputs emulate patterns of human language and reasoning. They happen to make sense to us (most of the time) because they are trained on troves of human-produced data that already makes sense. But is this the same as reasoning and thinking? LLMs generate content, but they don’t generate knowledge, even if what they do looks a lot like thinking.
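To see what “built on probability” means in practice, here is a deliberately tiny toy model—my own illustration, nothing like a real LLM’s architecture. It chooses each next word purely by how often that word followed the previous one in its “training” text, with no notion of truth, sources, or meaning:

```python
import random
from collections import Counter, defaultdict

# Toy "training data": a scrap of human-written text.
corpus = (
    "the historian reads the source and the historian questions the source"
).split()

# Count which word follows which (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word(prev, rng):
    # Sample the next word in proportion to how often it followed `prev`.
    counts = following[prev]
    words = list(counts)
    weights = [counts[w] for w in words]
    return rng.choices(words, weights=weights)[0]

rng = random.Random(0)
word = "the"
generated = [word]
for _ in range(6):
    word = next_word(word, rng)
    generated.append(word)

print(" ".join(generated))  # plausible-sounding, but purely statistical
```

The output reads like English only because the training text was English; the model “knows” nothing about historians or sources. Real LLMs are vastly larger and subtler, but the same point holds: fluency comes from patterns in human data, not from understanding.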
If we narrow our focus to specific tasks performed by historians, the limits of AI become even more apparent. Even information retrieval and summarization require complex skills, such as sourcing. But LLMs can’t trace their outputs to specific sources and can’t fact-check them. If you ask for sources, LLMs can generate plausible citations, and sometimes these correspond to real materials, but they’re not the sources that actually justify or provide evidence for the claims being made. That’s because LLMs don’t maintain the link between sources and claims, which makes it difficult to argue that the claims are justified. Language is severed from its context and reduced to decontextualized bits of data, which are then recombined into plausible language. This is why AI “hallucination” is structural—not a bug.
The human in humanities
Many AI enthusiasts, such as historian D. Graham Burnett, argue that human knowledge-making is not all that exceptional. After all, human-made history is riddled with errors and biases, too. Others point out that for much of human history, we’ve automated our language and copied our forebears.
But even Burnett acknowledges the essential human capacity to feel and ask questions, writing that “To be human is not to have answers. It is to have questions—and to live with them. The machines can’t do that for us.” Historical scholarship isn’t just the collection of facts. It involves interpretation and contextualization that make history useful in the present. Our experiences in the world, our embodied struggles to survive and thrive, are the engines of our questions, and they shape how we seek answers and how we deal with uncertainty.
LLMs can extend our knowledge, but they can also narrow it. This is human technology, built with human language, fed by human data and interaction, sustained by human labor and expertise. As a result, LLMs replicate inequalities present in human societies. Currently, dominant languages and cultures are drastically overrepresented in the training data. And LLMs have a strong bias toward the present. This problem compounds over time: as LLMs are trained on content that earlier LLMs generated, the same ideas get recycled, steadily shrinking the diversity of perspectives. This feedback loop can cause models to degrade, reinforcing inaccuracies and biases.
Reliance on LLMs is causing us to lose important connections between ideas. Hypertext—the links that help you move around the web—can tell us how information is connected and where it comes from. But as polished AI summaries become more common, these links are gradually disappearing. These seemingly authoritative summaries make information appear more settled than it is, and we lose the ability to trace sources and see how things connect. Researchers have found that this seamlessness can lead to false confidence in uncertain claims. If you’re a teacher faced with policing AI use in research papers, you’re probably familiar with this problem.
Historians in the age of AI
Historians, while fallible, have practices for dealing with uncertainty. When we encounter tensions and conflicts in the evidence, we can signal them through footnotes, citations, and a smattering of “possiblys” and “mights”—words that AI responses rarely contain. These practices give a window into how evidence is chosen, weighted, and analyzed.
Thus, as the American Historical Association Council has argued in their “Guiding Principles for Artificial Intelligence in History Education,” “far from rendering the discipline obsolete, generative AI may increase the demand for historians’ specific skills as societies and workplaces navigate an increasingly complex information landscape.” LLMs can be a powerful tool for historians, but using them effectively requires skills and expertise. As historian Steven Lubar explains, “I asked questions where I knew enough to evaluate the answers.” In other words, LLMs are only as good at historical thinking as the historical thinker using them. The AHA Council also identified a paradox of LLMs: while they can emulate credible knowledge, assessing this credibility “demands critical skills that the models themselves can neither teach nor foster.”
Technologies like LLMs are the product of human history and collective learning. Like many technologies (the personal computer and the internet, for example), they will reshape how historians do their work. But rather than making the historian obsolete, AI is making the very work historians do even more important. As the study of history reminds us, human knowledge-making and technology have always been entangled—with humans at the center of it all.
What does this mean for your classroom? Don’t shy away from teaching historical thinking skills! These will help your students become critical users of AI tools while they develop the kinds of skills colleges and employers are looking for. From sourcing to contextualization to claim testing (and more!), OER Project offers a variety of resources—including a one-stop shop for incorporating AI into your classroom.