(from Through the Looking-Glass and What Alice Found There, 1872)
'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.
"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"
He took his vorpal sword in hand:
Long time the manxome foe he sought --
So rested he by the Tumtum tree,
And stood awhile in thought.
And, as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!
One, two! One, two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.
"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!"
He chortled in his joy.
My goal in this series of articles is to develop a model-based description of grade-school arithmetic. In this article I will try to explain what I mean by that statement and why I think such a description might have value. My primary motivation for this project is to offer alternative ways of understanding elementary arithmetic in the hope that these ideas may lead to improvements in the elementary school mathematics curriculum. A secondary motivation for the project, but a close second, is that I think the topic is interesting and even fun.
The most difficult problem for me in writing this material is deciding who my reader is, and in particular how mathematically sophisticated that reader is and thus how much detail he or she wants and can tolerate. The problem is complicated because even though my reader is certainly an adult, I want to give a sense of how the material might be presented to children. As I read over what I’ve written so far, it’s clear that I’ve been inconsistent in my point of view. This is particularly true in connection with proofs. While the approach to arithmetic I have taken here may not be the usual one, it is no less “rigorous” or “mathematical”. So while my goal is to tie theory to experience, the formulation of that experience is still a theory requiring definitions, principles, theorems, and proofs. The very fact that my approach is different has undoubtedly caused me to adopt a higher degree of rigor than is probably necessary or even tolerable to most readers. I beg forgiveness for these failings, but I hope there will be enough content in what I have presented that some readers will find these obstacles worth overcoming.
I will begin by trying to explain what I mean by a model-based approach to a mathematical theory. Theoretically, mathematics could exist in a universe that was completely empty except for an entity that could carry out mathematical reasoning. This entity could construct and reason about mathematical theories without the need for any interpretation of those theories in reality or even imagination. In its purest form, a mathematical theory is simply a linguistic object. Like any language, it is constructed from some collection of “words”. The grammar of the language designates the roles of various classes of words and how the sentences of the language are formed. Some words like “0” and “1” or “bandersnatch” and “jabberwock” are designated as “nouns”. We generally think of nouns as symbols which denote objects, but of course in a completely empty universe there would be no objects for nouns to denote, and so to our lonely mathematician “noun” is just a designation for some particular collection of words[1]. Other words like “+” and “*” or “gyre” and “gimble” and “mimsy” and “frumious” are assigned roles which combine nouns into phrases. Still other words like “is” and “was” and “>” and “=” can combine phrases into simple sentences like “1+1 = 2” and “Twas brillig”. “Connective” words like “and” and “or” and “All” are used to combine simple sentences into more complex sentences like “Twas brillig and the mome raths outgrabe” and “All mimsy were the borogoves” and “For all numbers x and y, x+y = y+x.”
Having created the language for its theory, the mathematical entity designates certain sentences like the ones given above to be “axioms.” We think of axioms and their “consequences” as “true” sentences of the theory but for a system in which terms don’t actually denote anything, the notion of a sentence being true is meaningless. The mathematician can choose its axioms on any basis it likes, perhaps the meter created by a particular sequence of terms, or no basis at all. The final component defining the theory is a collection of rules, called inference rules, which specify, in purely grammatical terms, when one sentence called the conclusion of the rule is a consequence of one or more sentences called the assumptions. An example of such a rule might be: “if A and B are sentences then A is a consequence of the sentence ‘A and B’.” Another might be: “if A and B are sentences then B is a consequence of the two sentences ‘A or B’ and ‘Not A’.” Note that the application of these rules requires no understanding of the meaning of the sentences. Application of the rules is simply a matter of pattern matching. Like the set of axioms, the mathematician in this universe without meaning can choose its inference rules arbitrarily. For example, there is nothing to prevent it from deciding that ‘Not A’ should be a consequence of ‘A’.
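To make the point about pattern matching concrete, here is a minimal Python sketch of the rules just described. The representation is my own assumption, not part of the theory above: a sentence is either a plain string or a tuple such as ("and", A, B), ("or", A, B), or ("not", A), and the function names are purely illustrative. Note that nothing in the code knows or cares what “mimsy” means.

```python
# Sentences are plain strings (simple sentences) or nested tuples:
#   ("and", A, B), ("or", A, B), ("not", A)
# This encoding is an illustrative assumption.

def and_elimination(premise):
    """From 'A and B' conclude A. Returns None if the pattern does not match."""
    if isinstance(premise, tuple) and len(premise) == 3 and premise[0] == "and":
        return premise[1]
    return None

def disjunctive_syllogism(disjunction, negation):
    """From 'A or B' and 'Not A' conclude B. Returns None if the pattern does not match."""
    if (isinstance(disjunction, tuple) and len(disjunction) == 3
            and disjunction[0] == "or"
            and negation == ("not", disjunction[1])):
        return disjunction[2]
    return None

def arbitrary_rule(premise):
    """The deliberately arbitrary rule from the text: from 'A' conclude 'Not A'."""
    return ("not", premise)

# Each rule is applied by matching shapes only.
step1 = and_elimination(("and", "Twas brillig", "the mome raths outgrabe"))
step2 = disjunctive_syllogism(("or", "mimsy", "frumious"), ("not", "mimsy"))
step3 = arbitrary_rule("Twas brillig")
no_match = and_elimination(("or", "mimsy", "frumious"))
```

Here step1 is “Twas brillig”, step2 is “frumious”, and step3 is the tuple for “Not Twas brillig”. The last call returns None because the sentence does not have the shape the rule requires.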
We can think of the theory as a game. The goal of the game is to prove theorems. The game is played by generating sequences of sentences in the language of the theory in such a way that each sentence in the sequence is either an axiom of the theory or is a consequence, according to some rule of inference of the theory, of sentences preceding it in the sequence. Such sequences are called proofs and the last sentence of a proof is called the theorem of the proof. For example, if our mathematician takes as an axiom “The jabberwock was not mimsy” then depending on what rules it has allowed itself it might be able to prove “The jabberwock was not a borogove” with a proof that looked something like the following:
Here I have imagined unstated rules, the “A-Rule” and the “B-Rule”, by which statement 2 follows from statement 1 and statement 4 follows from statements 2 and 3. The important thing to notice about such a proof is that it requires no conception whatsoever about what any of the terms mean. It could be carried out by a computer and in fact, the process we have described is a good description of how computers operate. A computer generates sequences of patterns, which we can view as statements in a language, called the states of the computer, and a computer is equipped by the manufacturer with a collection of rules, called its instructions, which allow it to generate a new pattern from an old pattern. A program is analogous to a proof in that it defines the sequence of rules to apply which, together with the axioms, called inputs in computer jargon, generates a sequence of patterns resulting in a final pattern, analogous to the theorem of the proof, called the output. The mathematical entity we have imagined could in fact be a computer.
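As a rough illustration of how mechanical the game is, the following Python sketch checks a proof purely by pattern matching. Everything in it is an assumption made for illustration: the tuple encoding of sentences, the two rules (taken from the earlier examples, not the unstated A-Rule and B-Rule), and the two nonsense axioms.

```python
# Sentences: plain strings, or tuples ("and", A, B), ("or", A, B), ("not", A).
# Each rule takes the list of premise sentences and a proposed conclusion and
# answers True or False by shape alone.

def and_elim(premises, conclusion):
    """'A and B' yields A."""
    return (len(premises) == 1
            and isinstance(premises[0], tuple) and len(premises[0]) == 3
            and premises[0][0] == "and"
            and premises[0][1] == conclusion)

def disj_syll(premises, conclusion):
    """'A or B' together with 'Not A' yields B."""
    if len(premises) != 2:
        return False
    disj, neg = premises
    return (isinstance(disj, tuple) and len(disj) == 3 and disj[0] == "or"
            and neg == ("not", disj[1])
            and disj[2] == conclusion)

RULES = {"and-elim": and_elim, "disj-syll": disj_syll}

def check_proof(axioms, steps):
    """steps is a list of (sentence, justification, premise_indices).
    Returns the theorem (the last sentence) or raises ValueError."""
    for i, (sentence, why, refs) in enumerate(steps):
        if why == "axiom":
            ok = sentence in axioms
        else:
            ok = (why in RULES
                  and all(0 <= r < i for r in refs)  # premises must come earlier
                  and RULES[why]([steps[r][0] for r in refs], sentence))
        if not ok:
            raise ValueError(f"step {i + 1} is not justified")
    return steps[-1][0]

# Hypothetical axioms, chosen for their sound rather than their sense.
AXIOMS = [("and", ("or", "mimsy", "frumious"), "brillig"), ("not", "mimsy")]

PROOF = [
    (AXIOMS[0], "axiom", []),
    (("or", "mimsy", "frumious"), "and-elim", [0]),
    (AXIOMS[1], "axiom", []),
    ("frumious", "disj-syll", [1, 2]),
]

theorem = check_proof(AXIOMS, PROOF)
```

The proof has the same shape as the one discussed above: the second sentence follows from the first, and the fourth follows from the second and third. The checker accepts it and reports “frumious” as the theorem without any idea of what that word denotes.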
While I don’t know of any mathematicians who view mathematics as a collection of purely formal procedures and formulas, I fear that this is at least the first and often the last view that most students have. While lip service is given to the importance of discovery and modeling in mathematics education, as a practical matter, most of the time spent on what is called mathematics in grade school resembles programming far more than teaching.
There are serious problems with trying to program humans in the way that we program computers. One is that as a storage medium, the brain is far less stable than the semiconductor and magnetic media used in computers. The brain tends to forget or change the descriptions of rules and, even more interestingly, to create new ones completely spontaneously. As an example of rule creation, what second or third grade teacher has not seen students create a rule, which I’ll call the “BTS-Rule” for “bottom top switch”, that allows the bottom and top digits in a subtraction problem to be switched? For example,
1. Axiom: Problem is
     34
   -  6
2. BTS-Rule[1]: Problem is
     36
   -  4
3. SD-Rule[2]: Answer is 32
A computer capable of spontaneously generating rules like the BTS-Rule would be considered a major achievement; students capable of generating such rules are just considered poor students. The ability to generate new rules and change old rules is as essential to innovation as mutation and recombination are to evolution. We know how to build machines to carry out arbitrarily complicated procedures exactly. We do not know how to build machines that can innovate at anywhere near the level of human beings. Therefore, such creativity should be given high value. The problem is not that the student comes up with and applies a new rule. The problem is that without any interpretation to assign meaning to mathematical terms, the student has no basis for deciding whether a new rule is valid. Most such innovations will be invalid but occasionally the student may discover a valid and even useful transformation. For example, the student might come up with a rule that allows him to add the same number, e.g. 4, to “top” and “bottom” and decide that 34 – 6 = 38 – 10 = 28. In the context of pure symbol manipulation, however, the teacher attempting to program correct subtraction processing has no choice but to stamp out rule creation and any other variation from orthodoxy.
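Once terms have an interpretation, the two invented rules can be told apart by testing them. The Python sketch below is only an illustration under stated assumptions: it takes ordinary integer subtraction as the intended interpretation, reads the BTS-Rule as swapping the ones digits of top and bottom, and uses function names of my own choosing.

```python
def bts_rule(top, bottom):
    """Assumed reading of the 'bottom top switch': swap the ones digits."""
    top_ones, bottom_ones = top % 10, bottom % 10
    return top - top_ones + bottom_ones, bottom - bottom_ones + top_ones

def add_same_rule(top, bottom, k=4):
    """Add the same number k to both top and bottom."""
    return top + k, bottom + k

def is_valid(rule, problems):
    """A rewriting rule is valid if it never changes the difference top - bottom."""
    for top, bottom in problems:
        new_top, new_bottom = rule(top, bottom)
        if new_top - new_bottom != top - bottom:
            return False
    return True

# Two-digit tops with one-digit bottoms, as in the example above.
problems = [(top, bottom) for top in range(10, 100) for bottom in range(10)]

bts_example = bts_rule(34, 6)        # rewrites 34 - 6 as 36 - 4
add_example = add_same_rule(34, 6)   # rewrites 34 - 6 as 38 - 10
bts_ok = is_valid(bts_rule, problems)
add_ok = is_valid(add_same_rule, problems)
```

Under these assumptions bts_ok is False, since 36 – 4 is 32 rather than 28, while add_ok is True. Testing a finite list of problems can refute a rule but cannot by itself establish it for all numbers; that requires an argument about the interpretation.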
Let me describe a contrasting situation. I have a ten-year-old daughter, who like many girls her age, is infatuated with horses. Over the past year, she has spent hundreds of hours drawing pictures of horses. Her initial pictures bore little resemblance to real horses. Over time however, her pictures developed depth and detail and now are quite recognizable if not quite completely realistic. You can see the evolution of her drawing in her notebooks. She continuously tries out various changes to her drawings to make them more realistic. Some changes help and some don’t. It is truly an evolutionary process. This process works because she has a basis for determining whether a change is valid or not. She has actually seen real horses and can therefore judge whether a given change is consistent with the appearance of real horses.
We can make an analogy between a drawing and a theory by viewing a drawing as a statement in a language in which the terms are the lines and other figures used to create the drawing, and the relationships between terms are defined by their relative positions. In an abstract drawing, like a purely abstract theory, there are no constraints on the “statement” other than the aesthetic sensibilities of the artist. But for representational drawings like my daughter’s, where the drawing is intended to have a meaning, e.g. “this is what a horse looks like”, the intended meaning determines the validity of the drawing. When we interpret a drawing as an object, we interpret the figures in the drawing as features of the object. This curve represents a segment of an ear and this line is a part of the leg and so on. The drawing is valid if the relationships of the figures in the drawing are the same as the relationships between the features they represent.[2]
The same process can be applied to interpret a mathematical theory. Thus, the terms of the theory are interpreted as objects in some system of objects and the relational and operational elements of the language are interpreted as operations on and relations between the objects in the system. A statement in the theory can thus be interpreted as a statement about the properties of the system of objects in the same way that a drawing can be interpreted as a statement about the properties of the visible features of a horse. Just as real horses impose constraints on drawings intended to be interpreted as horses, so a real system of objects constrains theories intended to be interpreted in that system of objects. Such an interpretation provides a basis for deciding whether a given rule or statement is valid. If we provide the student with an intended interpretation of a theory we can say to the student: “You’re perfectly free to be creative so long as your new rules or changes to old rules are valid under this interpretation.” (It still has to look like a horse!)[3] For example, we can say to the student formulating the BTS-Rule, “That’s a very interesting idea. Let’s see if it’s true.”
In mathematics, an interpretation of a theory for which the theory is valid is called a model of the theory. This seems backwards from the usual use of the word model. While we might say that an artistic rendering of a horse is a “model of the horse”, the mathematical usage is analogous to saying “the horse is a model of the rendering.” Apparently, mathematicians take their theories more seriously than the intended interpretations of those theories! This point of view developed for a number of reasons. One was a realization that a given theory may have multiple and quite different kinds of models. In fact, by limiting the extent of a theory one can increase the range of possible models and thus the range of possible applications of the theory. Another reason was interest in exploring purely abstract theories with no obvious “natural” interpretations such as strange geometries, which contradict Euclidean geometry, or logics with values lying between true and false. Many such strange theories including the ones mentioned have ultimately found profound application.
In science and engineering the term model is generally used in the more conventional way. To a physicist the material universe is the central and in fact only object of study. Mathematical theories serve as models of the universe. The value of a theory is judged by how accurately it predicts the behavior of the universe. If the universe fails to satisfy some aspect of the theory then it is the theory rather than the universe that must change. The same remarks apply to a structural engineer modeling a building or an economist modeling currency markets.
The differing points of view of the mathematician and the scientist naturally correspond to two differing pedagogical approaches to teaching mathematics. In one, call it the mathematical approach, a theory is developed first and a model or models to which that theory applies are presented later. In the other, call it the scientific approach, a system or systems of objects of some kind are presented, and the theory is developed as a means of describing and making predictions about the behavior of that system.
In the one, theories precede models. In the other, models precede theories. To a mature mathematician or scientist, either approach may be acceptable. However, to a beginning student, who has no experience with mathematical theories, the mathematical approach can’t really make much sense. But maybe that’s OK. Maybe we can first train the students in a purely abstract language with abstract rules and procedures, and then later show the students how those abstractions can be used to answer questions about real systems. It is my impression that this is in fact the approach taken in most grade-school math curricula. A typical lesson in such a curriculum consists of teaching some computational technique and then presenting some “word” problems to which the technique is applied.
The problem I see in this approach is that it makes it seem almost accidental that the theory could be applied to the particular set of problems given. It misses the point that the theory was designed to solve a general class of problems including the particular examples given. It misses the point that other such theories can be designed to solve other classes of problems. While it provides the students with a basis for solving a set of routine problems, it gives no insight into how one develops theories and techniques to solve new problems. Knowledge of computational techniques is useless without a general understanding of when and how techniques can be appropriately applied to solve problems whose form may be different from those seen before. This process of matching theory to interpretation is called modeling.
As an example consider the development of plane geometry. An immense number of real world problems have concerned the means of construction and properties of roughly two-dimensional figures. These include problems in agriculture, construction, manufacturing, and art. These problems have been known and studied since antiquity. For example, an approximation to pi appears in the Hebrew Bible in connection with the construction of a temple. Eventually these problems were abstracted and represented in an idealized system of “points” and “lines” and “figures” which today we call the Euclidean plane. In this idealized system, points have no extent at all, lines have no width but extend indefinitely and are perfectly straight, circles are perfectly round and continuous, and the plane itself is infinite in all directions, uniform, and perfectly flat. The Euclidean plane has no exact representation in the physical universe of course[4] but to the extent that a real world system approximates this idealized system, truths about the Euclidean plane can be used to obtain results which approximately describe properties of the real system.
The theory of Euclidean geometry was developed to understand the properties of the Euclidean plane. This theory includes terms for “points”, “lines”, “segments”, “circles”, “sectors” and “shapes” along with operations for constructing figures and relations of “equality”, “congruence”, and “similarity”. The mechanism which connects this theory of geometry to objects in the Euclidean plane is called an interpretation. Terms for “points” in the language are interpreted as actual points in the plane, “lines” as lines, “circles” as circles and so on. As the diagram below indicates, an interpretation is a kind of map from terms and sentences in a language to the elements of, and assertions about, some system of objects. There’s nothing esoteric about this. This is the general way in which we assign meaning to terms in a language.

Operator symbols in the theory are interpreted as procedures for constructing objects. Thus an operator like “Circle(segment,point)” which is intended to denote the circle whose radius is the length of the given segment centered at the given point could be interpreted as a compass construction as illustrated below.

Again there is nothing mysterious about this. An expression like “make a one pound loaf of rye bread” is interpreted as a process which involves gathering ingredients and mixing and kneading and baking and results in a loaf of bread.
The relation symbols of equality, congruence, and similarity can be interpreted in our model as tests. For example, a sentence of the form “term1 is congruent to term2”, where term1 and term2 are triangle terms, could be interpreted as a procedure that first constructs the triangles according to the interpretation of the terms and then tests whether the triangles have exactly the same dimensions and shape. The test for “sameness” in shape and size could be interpreted by copying one of the triangles and determining whether it is possible, without deforming the copy in any way, to fit the copy exactly on top of the second triangle, as illustrated in the sequence of diagrams below.

[Figure: a copy of one triangle is fitted on top of the second triangle; result of the test: True!]

Thus simple sentences in the theory are interpreted in the model as assertions that may be true or false. This is no different than interpreting a statement like, “the pan is hot”, as an assertion about the outcome of an experiment, “touch the pan and determine whether there is pain.”
Our interpretation of Euclidean two-dimensional geometry in terms of shapes in the plane illustrates the general way in which theories are interpreted in a system of objects. Simple terms are interpreted as specific objects in the system. Operators are interpreted as constructions that construct objects from component objects. Simple sentences are interpreted as an assertion that the outcome of a test will be true and complex sentences are interpreted as assertions about the outcome of a collection of tests. As stated earlier, an interpretation of a theory in a system of objects is said to be a model of the theory if the interpretation of every theorem is a true assertion.
Two obvious questions arise. One is, “how can we know that a given interpretation of a theory is a model of the theory?” The second is, “If we have a model why do we need or want the theory? Can’t we just answer questions and solve problems by direct testing and construction in the model?”
Let’s start with the problem of showing that a given interpretation of a theory is in fact a model of the theory. Our problem is to show that every theorem of the theory is true under the interpretation. One approach would be to enumerate all theorems of the theory and verify them one by one. The problem with this approach is that there may be infinitely many theorems to verify. For example, for every line segment in the plane there is a theorem of Euclidean geometry asserting that the segment is part of an equilateral triangle. It would not be possible to verify that directly for every possible line segment.
Fortunately, we can simplify the problem. We start by verifying the assertion corresponding to each axiom of the theory. Suppose that’s been done. We then verify for each inference rule that whenever the interpretations of the precursor sentences for the rule are true, the interpretation of the conclusion of the rule is also true. This property of rules is called soundness. For example, consider the rule that concludes A from ‘Not B’ and ‘A or B’. Suppose the assertions associated with ‘A or B’ and ‘Not B’ are true. The assertion of ‘A or B’ is that the interpretation of A is true or the interpretation of B is true. The assertion of ‘Not B’ is that the interpretation of B is false. So it must be the case that the interpretation of A is true.
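For rules like this one, where each sentence is interpreted simply as true or false, soundness can be checked by trying every combination of truth values. The Python sketch below does that for the rule just discussed and for the arbitrary rule mentioned earlier that concludes ‘Not A’ from ‘A’. The function name is_sound and the use of lambdas to stand for interpreted sentences are my own illustrative choices.

```python
from itertools import product

def is_sound(premises, conclusion, n_vars=2):
    """True if the conclusion is true under every assignment of truth values
    that makes all of the premises true."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # premises true but conclusion false: rule is unsound
    return True

# From 'A or B' and 'Not B' conclude A.
syllogism_sound = is_sound(
    premises=[lambda a, b: a or b, lambda a, b: not b],
    conclusion=lambda a, b: a,
)

# The arbitrary rule: from 'A' conclude 'Not A'.
arbitrary_sound = is_sound(
    premises=[lambda a, b: a],
    conclusion=lambda a, b: not a,
)
```

The first rule passes all four cases, so syllogism_sound is True. The second fails as soon as A is true, so arbitrary_sound is False: a mathematician who cares about a model cannot adopt that rule.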
Now suppose we have shown or know by some other means that the axioms of the theory are true in the model and the rules of inference are sound. We can then show that every theorem will be true in the model. Recall that theorems are the concluding sentences of proofs and a proof is a sequence of sentences in which every sentence is either an axiom or the conclusion of some rule whose precursor sentences have appeared earlier in the proof. We claim that every sentence in a proof has a true interpretation in the model.
Let some proof be given. The first sentence in the proof must be an axiom since there are no preceding sentences in the proof and so, since axioms are known to be true in the model, the first sentence is true. Now, by the definition of proof, the second sentence is either an axiom or follows from the first sentence by a rule. If the second sentence is an axiom we know by assumption that its interpretation is true. If, on the other hand, the second sentence follows from the first sentence by a rule then, since rules are sound and the first sentence is known to be true in the model, the second sentence is true and so both of the first two sentences are true.
Now consider the third sentence. Either it’s an axiom or follows from a rule whose assumptions are among the first two sentences. If the third sentence is an axiom then by assumption it must be true. If on the other hand it follows from a rule whose assumptions are among the first two sentences, then since rules are sound and the first two sentences are true, the third sentence must be true in the model and so the first three sentences of the proof must be true in the model.
Now you should be able to see how to prove that the fourth sentence of the proof must be true in the model. Hint: the fourth sentence is either an axiom or the conclusion of a rule whose assumptions are among the first three sentences, and we know that the first three sentences are true in the model. Having shown that the first four sentences of the proof are true we can use the same reasoning to show that the fifth sentence is true and so on, for however many sentences are in the proof. So every sentence in the proof is true and so in particular the last sentence of the proof, which is the theorem of the proof, is true in the model.
We have thus shown that for any proof of the theory all statements of the proof, including the last, are true in the model and so all theorems are true in the model.
By the way, the kind of argument we have just given, which in effect depends on an “and so on” phrase, is called an inductive argument. The idea is that the truth at each step is induced from the truth of the previous steps. This kind of argument is sometimes compared to a falling line of dominoes. You knock over the first and then all the dominoes fall because each is knocked over by its predecessors.

[Figure: a line of falling dominoes numbered 1 2 3 4 5 6 7 8 …]
Thus, in the argument above, the truth of the first statement guarantees the truth of the second statement which guarantees the truth of the third statement and so on.
In order to apply the result above we need to know that the axioms of Euclidean geometry are true in the Euclidean plane. The first of Euclid’s axioms, or postulates, as he called them, for plane geometry states that, “Between any two distinct points there is a [unique][5] straight line.” Why should I believe this postulate? You might ask, “Well why don’t you prove it?” The response is, “in what theory?” This is an axiom of Euclid’s theory. In order to give a proof of this statement I would have to have another more primitive theory with its own axioms and logical rules and I would then still have the problem of verifying these more primitive axioms. Of course I could try to use a still more primitive theory to prove the primitive axioms and so on but I can never escape the need at some point to accept certain statements without proof. (That’s after all what we mean by an axiom.)
Now the fact that we accept a statement without a mathematical proof does not mean that we must accept it without evidence and conviction. Such evidence and conviction for an axiom can only result from experience. So how do I develop conviction about Euclid’s first postulate? I can’t literally do experiments in the Euclidean plane so I will have to gain experience from an approximation to the plane and extrapolate. Perhaps I sit at my desk with a piece of paper, a pencil, and a straightedge. I make two distinct dots on the paper and use the straightedge to draw a line between them. I try some other dots. I keep one dot fixed and move the other to different distances and directions from the first. I note that I can keep the straightedge on the first dot and continuously rotate the direction of the straightedge. Perhaps I draw a dot that’s too far from the first dot for my straightedge but I convince myself that a theoretical straightedge would connect the two. Further, I try different parts of the paper and imagine the paper extending infinitely and moving my experiments to arbitrary parts of it until I believe that the behavior I have seen would apply uniformly no matter where I am on the plane. I do note that, in contradiction to the axiom, I can draw multiple lines that touch the two dots, but I realize that this is because my dots, unlike true points, have a non-zero extent. By experiment I notice that as the size of the dots decreases, or as I am stricter about forcing the lines to go through the exact centers, it becomes more difficult to draw non-coincident lines between the two. Eventually I become confident that if the dots were infinitesimally small all lines between the two would coincide.
Euclid gives five postulates concerning plane geometry. Postulates two through four (that a finite line can be extended to a straight line of given length, that a circle centered at any given point with any given radius exists, and that all right angles are equal) can be verified experimentally in the manner described above for the first postulate. The fifth postulate, which is called the parallel postulate, seems a little more problematic. I won’t state the postulate, but a consequence of it is that two distinct lines that share a common perpendicular will never intersect no matter how far they are extended.

Unlike the first four postulates this one makes a statement about infinity. Put another way, while I can experience instances of the first four postulates, a line drawn between points, a circle about a given point, and so on, I can never fully experience an instance of this axiom because it would require extending the two lines infinitely in order to make sure they don’t intersect and that of course I can’t do. I certainly can’t construct contradictions to it on my piece of paper and apparently such lines don’t get any closer, but predicting what happens in the infinite seems a “stretch”. I only bring this up to point out that there is a kind of negotiation that goes on between theories, models, and applications in deciding how far it is safe to carry intuitions as axioms. Certainly modern physics has shown that a lot of intuitions one might have about reality are not valid. This is an essential lesson one has to learn in order to apply mathematics successfully.
Now let’s consider the second question raised earlier. Why do we need theories at all? Why don’t we just test and verify directly in models? In some cases performing the experiment directly would be too expensive or impossible or even dangerous. We might prefer using a theory based on axioms believed from experience regarding hot things to determine that the pan is hot rather than touching it. Predicting the effects of a magnitude 8.0 earthquake on Los Angeles, the structural integrity of a nuclear reactor design, or the effects of certain economic policies on the world economy are other examples where theory is to be preferred over test. Some problems require searching through a large collection of alternatives for a solution having some property and it may not be possible to perform the experiment more than once. A bomb expert defusing a bomb would certainly want to exploit as much theory as he can muster before deciding what wire to cut.
It can be argued that the advent of powerful computing devices allows the use of computational models in place of real world models where real world models are not feasible or simply too expensive. For example, I have heard that weapons makers using supercomputers are able to simulate processes as complex as nuclear explosions with great accuracy. Computer simulation is a powerful method for extending the application of mathematics but it does not eliminate the need for mathematical theories.
Every aspect of the design, programming, and analysis of a computer simulation is in fact an application of theory. Thus, the simulation itself is an exercise in theory, even if carried out on a computer. Further, evaluating the results of such simulations requires deep theoretical analysis because the computer simulation itself introduces an extra level of approximation to the modeling process. This is illustrated in the two diagrams below. In the first diagram, the application problem is translated into an approximately equivalent problem in the idealized model. (We use dotted figures to indicate where correspondences are only approximate.) The theory is used to reason about the idealized problem and the results are then translated back to the application.



In the second diagram below, we again translate the application problem into the idealized model. We then translate from the idealized model into the computer model and use the computer approximation of the theory to derive results, which are then translated to the application problem.

[Diagram not reproduced. Surviving labels: Theory; Postulate I, Postulate II, ….]
It is also impossible to use direct experiment, with or without a computer, to validate properties where an infinity of cases or even an exceptionally large finite number of cases must be tested. As an example of the latter, consider the problem of validating the software on a system consisting of thousands or even millions of interconnected computers, each responding to ad hoc input from its local environment. The number of possible states of such a system is astronomical. It would be impossible to validate the correct functioning of such a system through direct testing. Such systems exist today and will become pervasive in the future.
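To give a feel for "astronomical", here is a rough back-of-the-envelope sketch in Python. Every number in it is an assumption made up for illustration: a thousand computers, each with only two possible states, and an imaginary test rig that checks a billion joint states per second. Real systems have vastly more states per machine, so this understates the problem.

```python
# All constants below are hypothetical, chosen only for illustration.
NUM_COMPUTERS = 1000        # "thousands" of interconnected computers
STATES_PER_COMPUTER = 2     # absurdly low: each machine is just "on" or "off"
TESTS_PER_SECOND = 10**9    # assumed speed of an imaginary test rig
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def total_states(num_computers: int, states_per_computer: int) -> int:
    """Number of joint states when each computer's state varies independently."""
    return states_per_computer ** num_computers


def years_to_test_all(states: int, tests_per_second: int) -> int:
    """Whole years needed to visit every joint state once at the given rate."""
    return states // (tests_per_second * SECONDS_PER_YEAR)


states = total_states(NUM_COMPUTERS, STATES_PER_COMPUTER)
years = years_to_test_all(states, TESTS_PER_SECOND)

# Print digit counts rather than the numbers themselves, which are unwieldy.
print(f"joint states: a number with {len(str(states))} digits")
print(f"years to test them all: a number with {len(str(years))} digits")
```

Even under these generous assumptions the count of joint states has hundreds of digits, so exhaustive testing is out of the question and some form of theoretical argument has to carry the load.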
As illustrated above, the application of mathematics to solve real problems involves determining or creating an appropriate mathematical model, translating the terms and assumptions of the application into the language of the mathematical model, reasoning about the model in a corresponding theory, and translating results about the model back into predictions about the behavior of the real system. This is true for the kid dividing a candy bar as well as the physicist determining the structure of matter. Without an understanding of the models of a theory and the ability to formulate real problems in terms of those models the theory is practically useless. Further, the process of reducing a problem to a mathematical theory and interpreting the results of a theoretical derivation is often messier, more complex and more subject to error than the derivation itself.
As I have said, I believe that to most grade-school students mathematics is a collection of terminology and formulas to be memorized. In fact, I believe this is the case not only in grade school but throughout the educational system. The fact is that very few people have anything beyond a very superficial understanding of mathematics. To most people, mathematics is a collection of formulas and algorithms. Mathematics as the underlying framework of science, as the system in which we formalize and reason about the external world, is not widely appreciated or understood. This remark applies even to most engineers I’ve met, for example. It also applies to most elementary school teachers. Given that, it is not surprising that there is very little real mathematics in the grade-school curriculum.
One possible conclusion to draw from this, and I’m not completely sure it’s incorrect, is that mathematics beyond formulas and algorithms isn’t that important for most people. In the distant past, when mathematics in any form was rarely needed, mathematics was the province of experts who were consulted when a problem requiring calculation arose. As commerce increased in importance and transactions requiring calculation became more frequent, it became necessary for a larger number of people to learn to carry out the procedures. Now the variety of problems having to do with basic commerce was relatively small, so rather than teach a general theory to account for these few models, along with the reasoning necessary to formulate and solve problems, it was easier to just enumerate the methods of solution, essentially as algorithms to be executed. Numerical calculation, being a common element in these routines, was taught independently as a kind of library of subroutines. It was enough to teach calculation because it was possible to teach and learn by rote all common applications. Ad hoc problems were still left to experts.
As science and technology became more pervasive, the variety of mathematical applications grew beyond the ability of any one person to enumerate and learn. This problem could be dealt with initially through specialized professions and education. Most people could still treat mathematics as formulas and methods of calculation, but each person, depending on his profession, carried around a particular collection of such facts. Now, as science and technology have continued to progress, many professions have reached the point where even the methods specific to that profession are beyond the capabilities of a single person.
Fortunately, science and technology led to the development of the computer. Now the formulas and procedures for a given profession, or even lots of professions, can be embedded in a machine. The role of the expert, at least for common applications, is to choose the right procedure, input the appropriate data, and interpret the results. But of course now everyone can carry around all these mathematical procedures for all interesting professions (or download them off the internet when they become interesting). Initially one could argue that knowledge was still required to choose the appropriate method, input the data, and interpret the results. But as the power of the machines and the sophistication of the programs have grown, this argument has become outdated. Mathematical methods are now embedded inside expert application programs which give precise instructions on how to measure and input data, determine what methods to apply, and interpret the results for the user. (Look at what’s happened with tax preparation, for example, and even more impressively in the design of electronic circuits and mechanical systems.) Further, as more and more data is made available on-line, and intelligent devices with automatic sensors become pervasive, the need for humans to input data will be eliminated, even the knowledge of what applications to run when will be automated, and every application with even a small amount of economic value will be available. And this is all happening now.
There seems to be a prevailing belief in the United States that our education system in general, and mathematics education in particular, is inadequately preparing students for an environment in which data and processing resources are universally available, the so-called information age. The claim is that other countries do a better job of teaching mathematics and, as a result, our students will be unable to “compete in the global economy.” The theory is that our industries will be forced to relocate to other countries in order to have access to an adequately educated work force. This has led to an accountability-driven curriculum based on standardized tests. Since standardized tests can only measure routine problem-solving skills, this has led to a curriculum that is focused almost solely on perfecting routine calculation skills. The goal seems to be to try to cram more formulas and calculation methods into younger and younger children.
The absurdity of this strategy should be obvious. The competition in routine calculation skills and problem solving isn’t with humans in other countries. The competition is with machines and I assure you the machines will win. John Henry had a chance against the steam hammer. The combined efforts of the entire human population of the earth couldn’t out-calculate a single standard PC available at Wal-Mart today. And, following Moore’s law, the machines become twice as powerful every eighteen months.
Not surprisingly, I do believe that the value of mathematical training will increase in the future. In almost every endeavor there will be far more data and analytic tools available. With these tools and data, a properly trained person will be able to create increasingly accurate models of whatever process is being carried out and thus do a better job of predicting and optimizing the behavior of those processes. In short, there will be far more opportunities to apply scientific methods in a deeper way to a wider class of activities. Society at large, businesses, and individuals can benefit from increased efficiency and more accurate predictions. Further, the ability to create, integrate, and extrapolate from these more accurate models will lead to new solutions to old problems and entirely new opportunities to create value.
I don’t pretend that these remarks are very profound. All I’m really saying is that better data and analytic tools allow better science, and science has proved to be a pretty good strategy, versus, say, looking at animal entrails, for understanding how stuff works and making it work better. There will thus be more opportunities for individuals equipped with some degree of scientific training. Now when I speak of “science” here I’m not talking about earth-shaking discoveries about the origin or makeup of the universe. Almost any system at all can be the subject of scientific analysis. Science is fundamentally not a collection of facts or even theories. It is a systematic method for analyzing, describing, and reasoning about systems ranging from the entire universe to my refrigerator. Mathematics is the language and reasoning system in which science is expressed and carried out.
As the domains in which science can be effectively applied expand, the range and variety of systems that will require mathematical models will expand with it. We cannot even begin to enumerate the applications to which a given mathematical technique might be applied. Thus, the practitioners in these areas will have to be equipped not only with the mathematical techniques, but also with the ability to connect mathematics to real systems in order to determine the appropriate use of techniques and analytic tools. In short, they will need to be skilled in mathematical modeling. For this reason, I believe that an understanding of models should be an integrated part of the mathematics curriculum even in grade school. I also believe that this approach is pedagogically superior.
To an expert, the term “mathematical modeling” brings to mind statistical methods and differential equations and the like, but every application of mathematics, even elementary arithmetic, is an exercise in modeling. If the solution to some real problem is computed by the “addition of 2 and 3” then 2 and 3 must each have an interpretation as some element of the problem. Further, there must be some operation involving those elements of the problem which is modeled by addition rather than multiplication or some completely non-arithmetic operation. The process that interprets integers and integer operators in terms of the elements of the problem must ensure that the interpretation is valid. As we know, there are innumerable ways of misapplying arithmetic, and the maxim about not adding apples and oranges doesn’t begin to cover the gamut. Thus, even simple applications of arithmetic can serve to illustrate the process of how we assign meaning to mathematics and then use mathematics to solve problems outside of mathematics.
One might imagine that systems that model elementary school arithmetic are quite natural and simple. Such is not the case. I refer the reader to Frege’s Foundations of Arithmetic[6] for a philosophical treatise on the problem of interpreting numbers and arithmetic. Suppose I’m trying to add up my pennies. I wish to apply arithmetic to this problem so I need to develop an interpretation of arithmetic in terms of my system of pennies. How shall I interpret “1” in this system? A dialog might proceed along the following lines.
A: “A penny.”
Q: “Which penny?”
A: “A single penny.”
Q: “But which single penny?”
A: “It doesn’t matter.”
Q: “OK, I’ve picked a bright one. Now what?”
A: “Now take another penny and use it to interpret 1.”
Q: “But wait a minute, I thought the bright penny was 1?”
A: “Well, they can each be 1.”
Q: “But how do I interpret equality?”
A: “Any penny is equal to any other penny.”
Q: “But how do I tell them apart if they’re equal? If they’re all equal then do I just have one penny?”
The traditional way out of this mess is to consider sets of pennies, rather than pennies, as the interpretations of the natural numbers and to define an equivalence based on 1-1 correspondence between sets. We then try to interpret addition as the union operator on sets. But of course this doesn’t actually work unless the two sets are disjoint. Following this approach, in order to model the process of adding pennies we need to understand the concepts of set, 1-1 correspondence, equivalence, union, and disjointness. And this is just to interpret addition of non-negative whole numbers.
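The set-based interpretation can be made concrete with a small sketch. This is only an illustration of the traditional approach just described, not a proposal for how to teach it. The function names `same_number` and `add`, and the penny labels, are invented for the example. Note that `same_number` deliberately avoids counting: it pairs elements off one at a time, which is what a 1-1 correspondence between finite sets amounts to.

```python
def same_number(a: set, b: set) -> bool:
    """True if a 1-1 correspondence exists between finite sets a and b.

    Elements are paired off one from each set until one set runs out.
    A correspondence exists exactly when neither set has leftovers.
    """
    remaining_a, remaining_b = set(a), set(b)  # work on copies
    while remaining_a and remaining_b:
        remaining_a.pop()
        remaining_b.pop()
    return not remaining_a and not remaining_b


def add(a: set, b: set) -> set:
    """Interpret addition as union, which is valid only for disjoint sets."""
    if not a.isdisjoint(b):
        raise ValueError("the sets share a penny, so union would undercount")
    return a | b


# Hypothetical labels standing in for individual, distinguishable pennies.
left_pocket = {"bright penny", "dull penny"}
right_pocket = {"bent penny", "old penny", "new penny"}
five = {"p1", "p2", "p3", "p4", "p5"}  # a reference set playing the role of "5"

total = add(left_pocket, right_pocket)
print(same_number(total, five))           # True: "2 + 3 = 5" in this model
print(same_number(left_pocket, right_pocket))  # False: "2" is not "3"
```

Calling `add(left_pocket, {"bright penny"})` raises an error, which is the code's way of expressing that the same penny cannot be counted twice. Even this toy version needs every one of the concepts listed above.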
If we wanted to extend this line of interpretation to account for multiplication we would have to consider interpretations of numbers not as sets of things but as sets of sets of things and introduce a union operator on sets of sets. Subtraction introduces subsets and division introduces set factoring. And there is no obvious way to extend these kinds of models to fractions or negative numbers.
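As a rough illustration of what "a union operator on sets of sets" would look like for multiplication, here is a minimal Python sketch. The function name `multiply` and the penny labels are hypothetical, and a list of sets stands in for a set of sets because Python's built-in sets cannot contain other mutable sets. For brevity the sketch uses `len` to compare group sizes, which quietly assumes we already know how to count.

```python
from itertools import combinations


def multiply(groups: list) -> set:
    """Interpret a product "m groups of n" as the union of a collection of sets.

    The interpretation is valid only if every group has the same size and
    no two groups share an element.
    """
    sizes = {len(g) for g in groups}
    if len(sizes) > 1:
        raise ValueError("groups differ in size")
    for g, h in combinations(groups, 2):
        if not g.isdisjoint(h):
            raise ValueError("groups overlap")
    result = set()
    for g in groups:
        result |= g  # union over the whole collection
    return result


# Hypothetical labels: 3 stacks of 2 pennies each.
stacks = [{"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}]
print(len(multiply(stacks)))  # 6, modeling "3 x 2"
```

The two validity checks are the multiplication analogue of the disjointness requirement for addition, and nothing in this construction suggests what "half a group" or "a negative group" might mean.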
I believe that if we want to introduce true models into elementary arithmetic we are going to have to take a different approach. In this series of articles I will describe a different approach and demonstrate how it might be used in an extensive part of the grade-school mathematics curriculum.
[1] Yes, you’re right that his universe can’t really be empty. Besides the mathematician it must include symbols and rules defining theories, and so he could create a theory about these symbols and rules, and that “theory of theories” would have terms that actually denote things like symbols and rules and proofs and so on. While this has proved to be an interesting and productive line of thought, I don’t think it’s too relevant to what I want to do in these articles. For an enormously elaborated discussion of this and related topics see Douglas Hofstadter’s book “Gödel, Escher, Bach.”
[2] Of course, I’m asking the reader to “squint” in this discussion. I’m ignoring the problem of projecting three dimensions onto two. Much more importantly, the manner in which the mind actually assigns meaning to a collection of figures seems infinitely more complicated than this description. Please bear with me. I’m trying to use art to help explain mathematics, not vice versa!
[3] By the way, I am most definitely not suggesting that we teach mathematics by throwing students into a sea of interpretations and expecting them to evolve millennia of mathematics on their own!
[4] It’s fair to ask to what extent the Euclidean plane actually exists as an independent completely defined object even if only in our minds. For the purposes of this discussion please accept Plato’s view that the idealized plane exists in some sphere of reality.
[5] Apparently Euclid didn’t actually state uniqueness explicitly but his proofs assume it so it’s usually added.
[6] Gottlob Frege, The Foundations of Arithmetic, translated by J.L. Austin. Harper & Brothers, New York, 1953.