Learning (Still) Needs Knowledge
What’s changed since Kirschner, Sweller and Clark told us minimal guidance doesn’t work?
I recently revisited what I still think is one of the most important and influential papers in education. It’s still brilliant, but it got me wondering what had changed in the twenty years since it was published. It came back to me because I’d been noticing a run of posts on the edu-socials arguing that AI means students no longer need knowledge. It is the same “it can be outsourced or looked up with a few taps on a screen” line once used to claim the internet would make knowledge redundant. The posts go on to argue that discovery learning, dressed up as critical thinking and communication skills, is once again the way forward.
That paper was Kirschner, Sweller and Clark’s Why Minimal Guidance During Instruction Does Not Work (2006). Its argument is that discovery, inquiry (enquiry?), problem-based, experiential and constructivist teaching, pedagogies that ask novices to figure things out for themselves, are less effective than explicit teaching, which shows students clearly what to do and how to do it, with checks for understanding. The reason explicit teaching works is biology, not ideology.
Working memory is tiny. Long-term memory is where what we learn lives. Asking novices to search for solutions burns up the former without reliably building the latter. And with learning defined as a change in long-term memory, if long-term memory hasn’t changed, nothing has been learned.
That’s the argument. The paper was released the year I completed my teacher training, which was dominated by the more ideological, progressive and romantic pedagogies it critiques. I didn’t even know the paper existed for the best part of a decade as a qualified teacher. Now, two decades on from publication, with arguments still all over the internet for some variety of discovery learning, I wanted to see whether the science had changed or whether it still held. As recently as the last few weeks I’ve heard and seen teachers and senior leaders arguing against a knowledge-rich curriculum, or claiming that content is not important and skills are.
So I went on a deep dive to see where that thread of educational research had got to, and to check whether it was me being wrong again, whether I needed to flip my beliefs back on the strength of some new cognitive findings I’d missed.
The short answer is no, I don’t. The findings still hold, and the evidence base is stronger than ever. But the picture has become more interesting, and anyone involved in education should know about it if they want to be sure they are doing their best by their students. Very few people are intentionally limiting children’s ability to learn by stripping out content and teaching skills. There is positive intent behind what they believe and are doing, and AI has brought this type of learning back into the conversation. What the research shows, though, is that some dangerous assumptions remain firmly held, even with the case for knowledge-rich teaching close to conclusive.
As part of my deep dive, I listened to the audiobook of E. D. Hirsch’s Why Knowledge Matters (2016), and I think it should be required reading for everyone in education, especially heads and any leader responsible for curriculum and teaching quality. Hirsch’s work supports the findings of cognitive science. He pulls it up to the level of curriculum and equity, and makes the case in a way that’s hard to ignore.
The equity argument was the part I found most interesting. A skills-based, child-centred curriculum looks generous and open, and was often constructed to help reduce inequity, on the reasoning that “poor kids shouldn’t be learning about Mesopotamia”. They need skills! Hirsch refers to the Matthew Effect here. In practice, child-centred approaches advantage the children who are already advantaged, because middle-class kids are more likely to absorb any missing background knowledge at home, while disadvantaged kids are less likely to. If you leave content to chance you widen the knowledge gap. A specified, content-rich common curriculum is the great equaliser, not the constraint progressives long assumed. Remember, knowledge is what we think with, and cutting out all the knowledge reduces a child’s capacity to think.
The part of Hirsch’s writing everyone should look at is about the changes in France’s education system. France ran one of the strongest and most equitable elementary systems in Europe through the 1980s, built on a common, knowledge-rich curriculum. The 1989 Loi Jospin shifted the system towards skills, child-centredness and individualism (Hirsch covers this in the prologue and first chapter of Why Knowledge Matters). By 2007 achievement had fallen and inequality had risen sharply, while buildings, budgets, teacher quality and early years provision had stayed roughly constant. The curriculum philosophy was the only major thing said to have changed. It reads less like a policy argument than a natural experiment, which is why it gets cited and attacked so often.
A note of honesty on that, because it bears on how much weight you put on it. The France chapter is quite a contested part of Hirsch’s case. Critics point to PISA comparability problems, demographic change across the period and the fact that more than one thing shifted at once. It is still persuasive directionally, but it’s weighted differently from the findings of cognitive science. Kirschner, Sweller and Clark rest on decades of controlled experiments. Hirsch rests more on historical synthesis and case study. He points in the same direction as the science, and I think he’s broadly right, but France is an illustration of the argument, not the proof of it. The proof sits in the lab work. Hirsch exposes what that lab work means for curriculum when the science of learning is ignored and systems move away from communal knowledge towards individualised, child-centred approaches.
Hirsch is convincing that curriculum philosophy was a major part of the shift, especially once you’ve read the science that supports this. And he consistently points back to the work of cognitive scientists like Daniel Willingham. Willingham’s Why Don’t Students Like School? (2009) puts the mechanism underneath Hirsch’s claim. The line I quote most from him is: memory is the residue of thought. We remember what we think hard about. That sounds obvious until you follow it through properly, because it means a lesson that’s busy and engaging but doesn’t make students think about the right things will often leave very little behind. Activity is not learning, as I’ve written about before.
Willingham also dismantles the idea that critical thinking is a transferable skill detached from knowledge. Thinking well in a domain depends on having a lot of domain knowledge stored in long-term memory, because that is what frees up working memory to reason. The expert doesn’t think harder than the novice. The expert thinks with more already in place, so the hard part costs them less. Hirsch tells you knowledge is decisive for comprehension. Willingham tells you why, in terms of how memory and attention actually work.
The science of learning’s core findings have hardened
The 2006 paper has over 9,000 citations and the weight of evidence behind it has only grown. Mayer exposed a decade-by-decade pattern in which minimal guidance gets rebranded and pushed, then fails, then gets rebranded again. The names now include “deeper learning”, “agency” and “21st-century skills”. Same argument, same evidence against it. The pattern is still live and running.
The word “agency” on that list disappointed me, because I’m a strong believer in agency for students. So are discovery learning and agency the same thing? Thankfully not, but they do get tangled together constantly, and I felt the need to pull that knot apart to check…
Discovery learning is a specific instructional approach. It’s the thing Kirschner, Sweller and Clark (KSC) are attacking: hand novices a problem with minimal guidance and let them construct the knowledge themselves. As Mayer exposed, it travels under a lot of names (discovery, problem-based, inquiry, experiential, constructivist), but the KSC paper treats them as pedagogically equivalent. The argument is that all of them fail because they ignore working memory limits, schema and what novices actually need, which is guidance.
Agency is a broader and fuzzier term. It usually means something like ownership, autonomy or increasing self-direction. It’s about motivation and control, not a teaching method.
The confusion is that a lot of progressive pedagogy bundles the two together: the assumption that to give students agency you have to let them discover things for themselves, and that direct instruction strips agency away. There’s some slippage there worth catching and the KSC argument breaks that link. You can give students strong guidance, worked examples and explicit instruction while still building genuine agency, in the sense of motivation, investment and increasing independence as knowledge grows. The paper’s own point is that guidance can be relaxed as expertise increases, because knowledge in long-term memory gradually takes over from external scaffolding. Agency, read that way, becomes the output of good instruction, not the method itself.
So when a document or colleague slides from “we want students to have agency” to “therefore we need discovery learning”, it is worth flagging. Agency is a goal. Discovery is one poorly evidenced means people assume will get them there.
Agency is just like critical thinking and problem solving: it depends on knowledge. In Willingham’s excellent piece “Ask the Cognitive Scientist: How Can Educators Teach Critical Thinking?”, he explains that you cannot navigate a landscape you cannot see. In the architecture of the human mind, deep domain knowledge is what frees up the cognitive bandwidth required to make meaningful, autonomous choices. Without a foundation of facts, what we call ‘agency’ is just blind guessing in the dark.
The debate has shifted, though.
Hmelo-Silver and colleagues pushed back in 2007, arguing that problem-based learning isn’t really minimally guided, because there are tutors, structured problems and scaffolds. Kirschner, Sweller and Clark’s reply was swift. Fine, but if you concede scaffolding works, why stop short of the strongest scaffolds, like worked examples?
The consensus now is less about guided versus unguided instruction and more about how much guidance, when, and how you fade it. That includes genuine disputes like Kapur’s work on productive failure, where letting learners with some prior knowledge struggle first can sometimes pay off. The argument has effectively moved onto a continuum, and yet some people are still arguing that discovering things for yourself is the best way to learn, citing the very science that debunks the claim.
The feeds are full of posts calling for schools to give students more choice and freedom, and arguing that learning facts is increasingly irrelevant in the age of AI. It’s understandable when experts from outside education assert this, but it frustrates me how many of the comments agreeing with these posts come from teachers and educational leaders who should know better.
Cognitive load theory got a major upgrade in 2019
Sweller, van Merriënboer and Paas revisited cognitive load theory and folded in evolutionary psychology, importing David Geary’s distinction between biologically primary knowledge, things like speaking, reading faces and basic social learning, which we evolved to acquire naturally, and biologically secondary knowledge: reading, writing, maths, science and almost everything schools teach. Primary knowledge can be acquired through immersion and play. Secondary knowledge generally cannot. It has to be explicitly taught.
This is, I think, one of the most useful ideas for anyone trying to explain to a sceptic why “let them discover it” appears to work in a playground but fails in a science classroom, or in some Early Years settings where much of what’s being learnt is biologically primary knowledge. Though it is worth revisiting Hirsch here too, who accepted the role of play in the early years but argued, with strong case studies, that it should not be devoid of knowledge and should still carry communal curriculum content if we want stronger later outcomes.
The cognitive load toolkit gained more besides. There is the generation effect, which supports the need for learners to produce output rather than simply consume information through listening or reading. And there are environmental effects, which showed that physical space shapes cognitive load more than we once realised. I built this into the redesign of our business and economics classrooms, opting for almost nothing on the walls beyond whiteboards.
Element interactivity, the mechanism underneath the rest
Sweller’s 2010 work reframed intrinsic load around element interactivity: how many things must be held in mind at once. Low interactivity, like periodic table symbols, is manageable. High interactivity, like algebra, essay writing or grammatical structures, overwhelms working memory without guidance. Worked example effects, expertise reversal, split attention… much of it is now better understood through this lens. It gives teachers a clearer map of the brain’s architecture and helps explain why some techniques work so reliably.
Possibly the biggest endorsement came from everyone’s favourite, Dylan Wiliam. I suspect his tweet calling cognitive load theory the single most important thing teachers should know did more for the theory’s reach than any journal article ever managed.
The expertise reversal effect has become a cornerstone finding: what helps novices can hinder experts. It’s why the curse of expertise can be so dangerous in classrooms, and why my dad could never understand why I struggled with his maths explanations as a kid. He was brilliant at maths, but I needed someone to explain it more simply. Being an expert in your subject is not enough. You need an understanding of how novices learn if you want to build expertise without overloading working memory. Guidance must eventually fade, because scaffolds that never disappear become crutches that limit further learning. Teaching is, in part, the art of knowing which scaffolds to use and when to gradually remove them.
But does it replicate?
Yes. The replication crisis actually strengthened cognitive load theory more than it weakened it. Sweller has argued that replication failures often revealed missing variables, such as element interactivity, rather than falsifying the theory itself.
What has changed in practice?
Teachers now know far more about the science of learning than they did twenty years ago. Clark, Kirschner and Sweller followed up with 2012’s Putting Students on the Path to Learning: The Case for Fully Guided Instruction in American Educator, a more accessible, teacher-facing restatement of their original paper and the newer findings that followed. Worth noting that the same issue also included Barak Rosenshine’s Principles of Instruction, whose recommended reading points directly back to the 2006 paper. Some line-up!
Most teacher training courses now include cognitive load theory in some form, and Australia’s NSW Department of Education has made it central to its teaching framework. AERO (the Australian Education Research Organisation) has also strongly promoted explicit instruction and the science of learning.
Nearly every teacher will by now have encountered Rosenshine’s Principles of Instruction, which translated the research into ten practical classroom moves and became the backbone of many schools’ CPD programmes. Doug Lemov’s Teach Like a Champion series synthesised much of the same science into practical explicit instruction techniques and remains one of the most useful handbooks available to teachers.
Since around 2019, Ofsted has also shifted more explicitly towards valuing knowledge-rich curriculum approaches. Alongside organisations like the Education Endowment Foundation and Evidence Based Education, there is now far more infrastructure helping research reach schools and classrooms than there was when the original paper was published.
So practically speaking, and I had this exact conversation with a good mate and colleague over lunch last week, the evidence is now overwhelmingly clear that to teach well, teachers should:
Teach knowledge explicitly. Don’t expect students to construct what you haven’t given them.
Use worked examples generously with novices.
Build schema first. Discovery and open inquiry are rewards for expertise, not routes to it.
Let scaffolds fade. Crutches should not become permanent.
Understand that “authentic” tasks are not automatically effective tasks. Many simply overload working memory.
Check for understanding regularly.
Know that liking a method does not mean students are learning from it.
Remember that students often prefer less-guided approaches while learning less from them. Learning should involve struggle, but the brain naturally prefers the path of least resistance.
That final point urges caution around “student voice” when it comes to instructional design. Asking novices about their preferred teaching method is not always a reliable proxy for what helps them learn most effectively. Students may enjoy a lesson spent exploring on iPads, but enjoyment and durable learning are not the same thing.
The debate about whether explicit instruction and knowledge-rich curriculum approaches work is settled, or at least it should be. They do. What isn’t settled is the practice, or the politics.
What remains is the harder and more interesting work: helping leaders and teachers think carefully about how much guidance students need, when they need it, and how that guidance gradually fades as knowledge and expertise grow. Because the endpoint of explicit teaching was never dependence. It was always independence.




