Confusing the analytic with the empirical:
a problem for educational researchers

Kieran Egan
Faculty of Education
Simon Fraser University
Burnaby, B.C. Canada V5A 1S6


Herbert Spencer, writing in the 1850s, was one of the first to suggest that learning and development are parts of the natural world and exhibit regularities and, if we observe closely and properly, laws. And so, he wrote, “it follows inevitably that education cannot be rightly guided without knowledge of these laws” (1911, 23). He argued that the application of the methods of science to educational phenomena would enable education to progress as have other areas where the methods of science have been brought into play.

Since Spencer’s day, educational researchers have generated a large literature to explain why the anticipated success is not, at the time of their writing, evident in educational practice. Each generation of researchers has had to explain first what was wrong with the previous generation’s research (their theories or methods) and then why the new approach will soon start to deliver the goods. We can see this repeated many times in the literature on educational research during the past century. In the working lives of many of us, the methodological and conceptual errors of behaviorism have been discounted and the new “cognitive science” now lays claim to being in the process of drawing on new knowledge and insights that will allow a proper scientific study of education: “Today, the world is in the midst of an extraordinary outpouring of scientific work on the mind and brain, on the processes of thinking and learning, on neural processes that occur during thought and learning, and on the development of competence” (Bransford et al. 1999, 3). Neurophysiology will help us, many now expect, to bring science to bear more effectively on educational phenomena.

In general, the lack of evident success of attempts to shape and apply scientific methods to education has been put down to the great complexity of the phenomena we deal with. The physicist, in this view, has it relatively easy, dealing with the common properties of natural objects, whereas we have to try to get some grasp on the unpredictable contingencies of particular human choices and behavior. More basically, the knowledge generated in the physical sciences involves priority being given to the general rather than to the contingent and the particular.

In this article I want to suggest another reason why we have such difficulties. I want to argue that we are often failing to generate useful results from our studies, not because our phenomena are so complex, but because we often wrongly assume that we are doing empirical research and consequently are using inappropriate methods. In particular I will argue that much supposed empirical research isn’t empirical in the sense researchers assume, and its results are vitiated by that false assumption. I will suggest that this problem is more widespread than might initially appear to be the case.

My argument draws significantly on the work of Jan Smedslund (1979). His argument about what he called the confusion of “the analytic and the arbitrary” gained some attention in the late 1970s and early 1980s, but his radical claim that nearly all research that was claimed to be empirical could be re-described in terms of sets of “common-sense” theorems failed to gain many adherents in North America. He suggested that psychology and the social sciences in particular might be better seen as based on geometry, as an elaboration of proofs from theorems, than as derived from the methods of the physical sciences. It must be admitted that his solution to the problem he identified was too arcane for most to follow, or to accept as something that could indeed replace empirical research in the social sciences.

His critique, on the other hand, did present a challenge. Oddly, the evaporation of his argument’s influence on North American educational psychology was due in part to his having used, as a demonstration piece for his claims, the work of Bandura on “self-efficacy.” Smedslund (1978) claimed that all Bandura’s findings could be derived by inferences from an elaborated set of his “common-sense” theorems; that is, Bandura didn’t need to have done any of his research, because he could have found out all the things he claimed to have secured just by thinking clearly. Bandura responded to Smedslund’s claim in a way that might reasonably have been seen as inconclusive (1978). But when I later raised Smedslund’s argument with educational psychologist colleagues, I was confidently told that “Bandura had demolished it.” This seemed to me a highly disputable view of the arguments, which I don’t want to rehearse here. I thought Bandura’s response to Smedslund’s very complex arguments was accepted too easily by those who might have felt some threat from the Scandinavian’s work.

What I would like to do here, then, is resuscitate a key point in Smedslund’s work that seems to me not to have been adequately appreciated, and also to put it in a somewhat different context. I hope I may be forgiven for beginning with a personal, and slightly embarrassing, anecdote that helped me to clarify the problem I mention in the title.

Researching stories

Like nearly everyone in Education, no doubt, I have been interested in why children learn some things well and remember them with enthusiasm while having great difficulty learning other things that seem no less complicated to learn on the face of it. One fairly obvious difference between the two conditions seemed connected with how well or otherwise certain knowledge engaged children’s imaginations. Not having had training in empirical research, I thought I would examine in some detail the kinds of things that seemed most readily to engage students’ imaginations and see whether I couldn’t derive some principles about successful learning from my analyses.

One of the things I began to focus on was the stories that children seemed to be most strongly engaged by, as well as games and some other topics. But for present purposes, I’ll stick with stories. It became clear that the kinds of stories that engaged children changed as children grew older, and were somewhat different in different places, and for children with different backgrounds. Even so there seemed to be some common features in these stories.

What emerged from my analyses (which drew on poetics, linguistics, and other branches of study that offered clues to how and why stories worked) was a set of principles that helped to account for what made these stories engaging. I then took some of these principles and designed from them a planning framework that teachers could use to design lessons or units of study in math, science, social studies, etc. The idea was to build into the frameworks a way of using the principles that helped account for stories’ engaging power so that these principles could be used to make curriculum materials engaging to students’ imaginations. The frameworks were a kind of poor cousin to those derived from Ralph Tyler’s model (1949).

I worked with teachers, and showed them how they might use the framework and principles, and would give talks to pre-service teachers at my own institution. There was enough interest in it that I wrote a short book about it (Egan, 1989), with a variety of examples about how the framework might be applied in different curriculum areas. (The book took just over a month to write, and helped me discover another of Murphy’s laws: The amount of time and care one spends writing a book is inversely proportional to the number of copies it sells. Teaching as Story telling has outsold all my other books put together.)

As you might imagine, I was very gratified when I would receive messages from teachers telling me that they had found the framework and principles really helped them, and that the children had learned well and enthusiastically, etc. (No doubt I haven’t heard from the teachers who had different results.) After a while, I received messages from teachers who wanted to give presentations about the principles and framework at a professional development day. Sometimes these teachers would have been asked by their supervisor of curriculum or district administrator what was the “research base” for their presentation. They had learned in their own professional training that a “research base” was required for something to be reliable. And by research base it was clear they meant empirical research results.

In response to the teachers or administrators asking me what the research base for “teaching as story telling” was, I had to tell them there wasn’t one. I just made it all up. This was commonly received with the kind of shocked silence you might expect if you’d confessed to a preference for some exotic sexual practices.

But as a “research base” seemed almost a prerequisite for some people to take the principles and framework seriously, I decided I should do some research. So I did, over a number of years in a few local schools. And did the framework work? Well, of course it worked, but I also had the experience every researcher has: that it worked well with some children some days, and with others less well, and with some hardly at all on one day and wonderfully well a day later, and so on. In general, it seemed to work wonderfully well for most children, but there was the usual array of variations in performance, among both the children and the teachers. So, what was I to do with the results?

Well, this was complicated by two factors. First, I was no more confident at the end of this process than I had been at the beginning about the value of the framework and principles. Second, during this period I had been doing a lot of reading in the history of educational research through the twentieth century for a book I was then working on. What was most striking about this research literature was that it was full of studies like mine. That is, someone had an idea about some method of teaching, did a study, and showed wonderful results. We have mountains of data about how to successfully teach every subject in the curriculum for every grade level in a vast range of different conditions.

Everyone knows this, of course, but it was dispiriting nevertheless to read this material extensively. I mentioned above the significant amount of secondary literature that has been generated trying to account for why this massive “research base” seems not to be producing the kinds of results in ever-improving education an outsider might reasonably expect it should produce. The currently popular answer, of course, is that it wasn’t scientific enough. We need, we are told, more rigorous scientific studies to deliver knowledge about how best to teach and how best learning can occur.

Let me conclude my sad story about researching the educational uses of story-based planning frameworks before coming back to the issue of the requirements of science for the study of education, and whether there might not be more hindrances in our way than can be overcome by more rigorous forms of the kind of research that Herbert Spencer set us on to do.

The crucial defect that I identified in my research was of the kind Smedslund’s analyses helped me to see. I was trying to discover whether teaching performed with my story frameworks increased children’s learning. The problem was that my question, and the methods used to test it, confused what I am calling an analytic component with the empirical component. That is, a hidden part of my research question might be re-stated as something like: “Do features of stories that engage children’s imaginations engage children’s imaginations?” Remarkably, I discovered that they did. But as a finding this was something like discovering empirically that all the bachelors in Chicago are unmarried males. While one could design empirical studies—run surveys, etc.—to establish the unmarried status of Chicago’s bachelors, it would be a tad futile. Nor would it make sense to say that people might have believed it in the past, but now we had demonstrated it scientifically. The relationship between unmarried status and bachelorhood is what we might call analytic; it is established in the meaning of the terms and can be derived from analysis of their meaning. What I want to suggest now is that it wasn’t just my own study that seems vulnerable to this problem, but many others too. I’ll leave it to you to decide how widely this writ might run.

The analytic and the empirical

Let’s look at how these two components work together, and often confuse us, in empirical research. In the case of my research, I have identified the analytic component above. It is not an empirical question whether stories engage children’s imaginations; if we spend some time on it, we can see that engaging imaginations and stories are not distinct things. (Which kinds of stories engage different children at different ages is a distinct question, and we might expect an empirical study to enlighten us about that, though I think we might get less than we think from the empirical part, and find that the analytic part even of that question delivers more than we might expect; but we’ll come to that later.) If we found that stories did not engage children’s imaginations, we would assume there was something wrong with the stories or with the children. The engaging power of stories is tied up with stories’ relationship with language and how our languaged minds engage the world (Egan, 1997). We can’t spell out all these ties, but that doesn’t mean they will yield to empirical study, because they are tied up primarily with meanings. The meanings of “story,” “imagination,” and “children’s minds” are connected before we do any empirical research to discover their connections. That is, before we begin our empirical research, we are guaranteed positive results because of the hidden analytic ties among the meanings of the terms that form the bases of our study. We also will have the usual variations in our results due to particular children’s inattention because of hunger, or a game they are looking forward to, or their irritation with some other child, and so on. That is, there is a huge range of genuinely empirical matters that will influence our results.

Smedslund argued that the positive results of empirical research in the social sciences resulted from the hidden analytic component guaranteeing a total positive connection, while the genuinely empirical elements reduce that positive connection. We try to control for the confusions of the empirical component by having large samples, control groups, etc. But the analytic component generalizes absolutely and the empirical component doesn’t generalize at all. Let me try to clarify this with a simple example.

This is an example that resulted from a colleague rejecting Smedslund’s arguments on the grounds that, while Smedslund’s analyses of particular pieces of research were convincing, they worked because he had chosen bad research. The few cases of educational research on which I had used Smedslund’s critique were similarly discounted as due to the research being faulty in the first place. That is, my colleague agreed that all those pieces of research we had chosen more or less at random were vulnerable to the critique of their confusing empirical and analytic components, but he believed these were rare cases. I asked him to describe some finding of empirical research on learning that was purely empirical and would not be vulnerable to Smedslund’s critique. The example he proposed was the finding that ordered lists are learned more easily than random lists.

Some years ago it was common to perform research on children’s abilities to learn randomly ordered numbers. Indeed, it was on this kind of research that confidence in the above generalization about learning ordered lists was based. I took the case of children’s learning and memorizing seven-digit numbers. How could such a study be vulnerable to Smedslund’s critique?

First, we should note that in the studies from which the secure generalization was derived, considerable differences were noted in individual children’s abilities to learn and memorize the various assigned numbers. We expect this. And some children’s ability to learn some random numbers differed from their ability to learn others. We expect this too. In one case the randomly assigned number is a child’s telephone number, in another it is the numerals of the child’s birthday, and so on. But sufficiently large samples neutralize such irregularities, and we accept the significant variability in results as a part of the problems that are inevitable in dealing with human subjects.

Perhaps you have by now been alerted to look for analytic connections among the terms of the research question. You will perhaps suspect that what is ordered is not entirely disconnected from our ability to learn. That is, what we mean by ordered is connected with what it is easier for us to learn. The analytic component concerns the conceptual ties between order and learnability. Our minds’ ability to learn and our notions of what counts as ordered are connected before and regardless of whatever research shows about their relationship. If students in our experimental group learned random lists more easily than ordered lists, we would have scanned the lists for some order we had failed to notice. On discovering that, in one case, the supposedly random number was the student’s telephone number, we would feel satisfied that we had accounted for the anomalous result. What we mean by order is conceptually connected to what we can more readily recognize and learn. No experiment is required to establish the generalization.

In our experimental group, however, we will have had some variability among subjects’ learning and memorizing the random numbers. The telephone coincidence is just one dramatic anomaly, but then there will be the case of the numbers that are, for another student, his mother’s birth date, and the one that is only a digit different from another student’s bank account code, and so on. Certainly not all random numbers will look equally random to all subjects. But these findings are arbitrary. We control for them by having large samples and other methods. What we cannot do, of course, is generalize from these anomalies. We cannot generalize about that student’s ability to learn and memorize random numbers or about other students’ ability to learn and memorize those particular numbers.

So in the case of this research we have an analytic tie that guarantees that we will establish a strong positive correlation both between orderedness in the lists and the ease of learning and memorizing and between randomness and difficulty. We have, in addition, a range of arbitrary elements that will have ensured that what counts as ordered for one subject will seem random to another, and a variety of indeterminable arbitrary contaminants in our data. By confusing the two, by failing to distinguish the analytic component from the arbitrary components, we treat the results of our study as an empirically established connection. The analytic component, however, generalizes absolutely. The arbitrary elements cannot be generalized at all. We do not need an experiment to establish the analytic component. And the arbitrary elements, which are genuinely empirical, cannot be generalized.

Earlier, A. R. Louch (1966) had shown how much research in psychology had similar defects. He began with the example of Edward Thorndike’s “law of effect,” which claimed to have established that people choose to repeat behaviors that have pleasurable consequences. Louch pointed out that repeating behaviors and expecting pleasurable consequences are not conceptually independent. The two are analytically tied: what we mean by choosing to repeat behaviors is tied up with what we count as pleasurable consequences. Louch further noted that the findings on E. R. Hilgard’s list of those firmly established by psychological research were similar in kind. Hilgard’s first proposition was that “brighter people can learn things less bright ones cannot learn” (1956, 486). But what we mean by brightness involves the ability to learn more. Or take a more recent example. In the How People Learn project the aim has been to focus on findings that “have both a solid research base to support them and strong implications for how we teach” (Donovan et al. 1999, 12). The basic principles derived from carefully applying these criteria include the finding that: “To develop competence in an area of inquiry, students must (a) have a deep foundation of factual knowledge, (b) understand facts and ideas in the context of a conceptual framework, and (c) organize knowledge in ways that facilitate retrieval and application” (12). But (a), (b), and (c) are definitional of what we mean by competence in an area of inquiry. Empirical research could not have established that one could be competent in an area of inquiry without deep factual knowledge (and how deep is “deep”?), or without understanding facts and ideas in the context of a conceptual framework, or while organizing knowledge in ways that hindered retrieval and application.

It may prove of practical value to spell out the meaning of competence like this, but the spelling out could have been done without the empirical research that is supposed to have established these conditions of competence.

If the rock is the problem of the role of the analytic, the hard place for research in education is the arbitrariness of genuinely empirical findings.


It’s not as though we haven’t been warned enough. In “The Historical Meaning of the Crisis in Psychology” Vygotsky (1997, p.3) wrote:

A concept that is used deliberately, not blindly, in the science for which it was created, where it originated, developed, and was carried out to its ultimate expression, is blind, leads nowhere, when transported to another science. Such blind transpositions, of the biogenetic principles, the experiment and the mathematical method from the natural sciences, created the appearance of science in psychology, which in reality concealed a total impotence in the face of studied facts.

Having scientific methods, that is to say, is only half the battle; the methods have to be appropriate to the phenomena they are used on—and that’s the half that causes us problems with regard to education. The most celebrated statement of the problem, referring to psychology, is: “The existence of the experimental method makes us think we have the means of solving the problems which trouble us; though problem and method pass one another by” (Wittgenstein 1963, 232).

The reading that brought me to the conclusion that my story-based research was fruitless is something I recommend to everyone involved in current educational research. We tend to focus on recent work and our hopes for its results in practice. We make changes in our methodology in response to evident lack of success of our predecessors, or we think we do. Maybe we should consider it a genuinely empirical question to ask whether empirical research can bring about improvements in education of the kind promised for 150 years. The evidence is not very evident.

I have argued that a subtle but powerfully disabling conceptual problem is evident in a significant amount of empirical research on educational phenomena. This seems worth discussing at a time when massive expenditures are being devoted to the kind of research that seems to me most vulnerable to this critique, research that is being proposed as the only reliable solution to the practical problems of modern education. There are grounds to doubt the good sense of this move. Walking faster with improved style really doesn’t help if you’re going in the wrong direction.


Bandura, A. 1978. On distinguishing between logical and empirical verification. Scandinavian Journal of Psychology, 19, 97-99.
Bransford, John D., Ann L. Brown, and Rodney R. Cocking, eds. 1999. How people learn: Brain, mind, experience and school. Washington, D.C.: National Academy Press.
Donovan, Suzanne, John D. Bransford, and James W. Pellegrino, eds. 1999. How people learn: Bridging research and practice. Washington, D.C.: National Academy Press.
Egan, Kieran. 1989. Teaching as story telling. Chicago: University of Chicago Press.
Egan, Kieran. 1997. The educated mind: How cognitive tools shape our understanding. Chicago: University of Chicago Press.
Louch, A. R. 1966. Explanation and human action. Berkeley: University of California Press.
Smedslund, Jan. 1978. Bandura's theory of self-efficacy: A set of common-sense theorems. Scandinavian Journal of Psychology, 19, 1-14.
Smedslund, Jan. 1979. Between the analytic and the arbitrary: A case study of psychological research. Scandinavian Journal of Psychology, 20, 101-102.
Spencer, Herbert. 1911. Essays on education, etc. Introduction by Charles W. Eliot [1910]. London: Dent.
Tyler, Ralph. 1949. Basic principles of curriculum and instruction. Chicago: University of Chicago Press.
Vygotsky, L. S. 1997. The collected works of L. S. Vygotsky. Edited by Robert W. Rieber and Jeffrey Wollock. Vol. 3. New York: Plenum.
Wittgenstein, L. 1963. Philosophical investigations. Translated by G. E. M. Anscombe. Oxford: Blackwell.
