What follows is the first of a series of discussions about elementary mathematics and, in particular, the concept of number, based on the concept of machine. I started thinking about this topic when I was asked to help a student teacher I know prepare a lesson for a third grade class. The lesson attempted to introduce the students to equations and their solution. The lesson came out of the text and was assigned by the student teacher’s master teacher.
A typical problem to be solved was:
If 3 x X = 12 what is X?
The student teacher’s quandary was how to make this topic at least comprehensible and if possible even interesting to the students.
The method given in the textbook for solving the problem above was essentially the same as the one given in first year algebra:
· Divide both sides of the equation by three;
· Cancel the “3”s on the left hand side;
· Divide 12 by 3;
· Conclude that X = 4.
This procedure was described by a sequence of diagrams, each showing the next step in the transformation. There was no serious attempt to explain the meaning of the equation, to justify the steps, or to motivate the strategy of the solution. The intent was to introduce the method as a purely mechanical procedure in much the way that long division and other elementary algorithms are taught.
The authors of the text were clearly under no illusion that a typical or even advanced third grader would understand the meaning and justification behind an algebraic derivation. The level of abstraction involved in understanding the meaning of an equation and the idea that equations are objects, which can be transformed, is light years beyond anything the student has seen to this point. He has been encouraged to understand numbers and operations on numbers in terms of concrete models. For example, he might interpret a number as a pile of pennies and the operation of addition as a “putting together” of piles. This kind of modeling allows him to apply intuition gained from experience with real objects to the abstractions of arithmetic. But what kind of concrete interpretation can he hope to give to something like an equation or a variable?
I am in no way criticizing the authors of the text for introducing advanced topics in the form of formal procedures. There is a theory that theology follows from liturgy. In this approach students learn methods first and over time with the experience of using these methods and applying them to problems they will gain an understanding of their meaning. On the other hand, where it is possible to provide concrete interpretations one should do so. The immediate problem for the new teacher with visions of the Socratic method in her head was that teaching a mechanical procedure didn’t make for a very interesting lesson.
My goal in developing the content for the lesson was to find a way to present the basic concepts of equations and equation solving in a form that would allow the children to visualize problems in terms of what I’ll call intuitable objects and to understand the method of solution in terms of construction and manipulation of those objects. What I mean by (the admittedly awful word) intuitable is that the object can be comprehended to a sufficient level of detail that the student will have intuition about the behavior of the object and can then apply that intuition to understanding the mathematical properties of the object. While I wasn’t hoping to be able to give the student the equivalent of a pile of pennies, I did want to give him something to visualize that would help explain the mathematics. This goal led to the ideas that follow.
We start by considering how we might visualize a variable and an equation involving a variable. A variable is a kind of object that can be assigned values. An equation is a statement that, depending on the value we assign to the variable, may be either true or false. This allows us to think of an equation as a kind of question-answering machine. This machine accepts a number, which is to be assigned as the “value” of the variable, and the machine answers the question: “Is the equation true when this value is assigned to X?” We can picture such a machine operating as in the following diagrams.


This machine is an example of what we will call an Input/Output Machine. We call it that because if you put something, called the Input, into the machine the machine will produce something, called the Output, in response. For the machine above, the input is a number and the output is an answer: “Yes” or “No.” Using this machine, we can restate the original problem as: “Find an input value to assign to X which causes this machine to output the answer Yes.”
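For readers who like to experiment, the question-answering machine can be sketched in a few lines of Python. This is only an illustration; the name `equation_machine` is my own choice and does not come from the lesson.

```python
def equation_machine(x):
    """Answer the question: is 3 x X = 12 true when X has this value?"""
    return "Yes" if 3 * x == 12 else "No"


# Feed the machine a few candidate values for X and record its answers.
answers = {x: equation_machine(x) for x in range(1, 7)}
print(answers)
```

Solving the problem then amounts to finding the input whose recorded answer is "Yes".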
While this machine may help to visualize the form of the problem (among the different possible values for X, find one that makes the equation true), it still involves concepts, namely variable and equation, that are linguistic rather than concrete. We can simplify the problem in two ways. First we’ll get rid of the equation by considering the following machine.


We’ve replaced the equation, “3 x X = 12?” by the expression “3 x X”. This machine takes a number as input and produces a number, rather than “yes” or “no”, as output. In this case the input is a value to assign to X and the output is the value of the expression “3 x X”. In terms of this machine we can restate our problem as “Find an input value to assign to X which causes the machine to output 12.”
Next we note that we don’t even need the variable X to describe this machine. This machine takes whatever input you feed it and produces an output by multiplying the input by 3. We’ll just call this the “Times 3” machine. (We could also call it the “tripling” machine.)


Now our problem can be stated entirely in terms of the Times 3 machine: find a number which when input to the Times 3 machine produces output 12. For this visualization of the problem to be of help to us in solving this problem and its variants we need to develop our intuition and knowledge about input/output machines. For this purpose there is no need to restrict ourselves to input/output machines that accept numbers as input.
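A minimal sketch of the Times 3 machine, together with the most naive way of solving the restated problem (trying inputs one after another), might look like this. The helper `find_input` and the range of candidates are assumptions made for the illustration.

```python
def times_3(x):
    """The Times 3 machine: whatever number goes in, three times it comes out."""
    return 3 * x


def find_input(machine, wanted_output, candidates):
    """Try each candidate in turn; return the first one that yields the wanted output."""
    for candidate in candidates:
        if machine(candidate) == wanted_output:
            return candidate
    return None  # no candidate produced the wanted output


solution = find_input(times_3, 12, range(0, 13))
print(solution)
```

Trial and error works here only because the list of candidates is short; it says nothing yet about why the answer is what it is.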
We have experience with many examples of objects or systems of objects that behave like input/output machines. A vending machine provides a good example. The input to the machine is money and the output is candy or whatever products the vending machine is vending. Real vending machines are a little more complicated than this of course because the input is usually the money plus an item selection, which we may make by pressing a button or pulling a knob. We could picture such a machine this way.

There’s no problem with an input/output machine having more than one input. There’s also no problem with input/output machines having more than one output. For example, we might have an additional output for change.
A factory is a kind of input/output machine. The inputs are the raw materials and the output is the finished goods. The factory below takes cocoa and sugar as inputs and outputs chocolate bars.

Of course, an input/output machine needn’t be made of brick or metal. A person baking cupcakes can be thought of as an input/output machine. The inputs are the ingredients (flour and so on) and the output is a cupcake. A leaf can be thought of as an input/output machine that takes sunlight, water, and carbon dioxide and outputs sugar and oxygen. An animal can be thought of as an input/output machine that takes oxygen and sugar as input and outputs water and carbon dioxide. You can even think of yourself putting on your shoes and socks as an input/output machine. The input is you in bare feet, a pair of socks and a pair of shoes and the output is you with your socks and shoes on.
The really interesting thing about input/output machines is that sometimes you can take two or more input/output machines and connect them together to make a new kind of input/output machine. As an example, suppose I sell machines and I’ve got two kinds of machines. One is a cupcake-making machine that makes cupcakes from mix. The other is a chocolate frosting machine that takes whatever you give it, (dogs, cats, kids, bats, balls, cookies, fruit, whatever!) and puts frosting on it.


One day I get a call from someone who wants a special machine that makes chocolate frosted cupcakes. I say, well we have a machine that makes cupcakes and we have a machine that puts chocolate frosting on things but I don’t have a machine that makes a chocolate frosted cupcake. Maybe the chief machine maker (the CMM) can figure out a way to make one? The CMM says, “No problem, we’ll take a cupcake maker and a chocolate froster and connect the output of the cupcake maker to the input of the chocolate froster and we’ll put the whole thing in a box and call it the Chocolate Frosted Cupcake Maker.”
So now I’m selling three machines: my cupcake maker, my chocolate froster, and my chocolate frosted cupcake maker. After a while I add a “Cherry Topping” machine to my product line. The cherry topper takes whatever you put in and puts a cherry on top.

Things are going along well and then one day I get a call from a customer who wants a machine that will make a chocolate frosted cupcake with a cherry on top. I don’t make one so I go to my chief machine designer (the CMD; he’s got a crew now that actually builds the machines so he doesn’t have to get his hands dirty). He says, “No problem,” and designs a machine consisting of a chocolate frosted cupcake-making machine connected to a cherry-topping machine with a box around the whole thing. So the contraption looks like this.
This connecting of suitable components is an essential part of problem solving. Complex problems are decomposed into simpler problems, which are solved separately, and then the solutions are composed to solve the bigger problem. It is also the essence of logical deduction and corresponds to the inference rule: “If A implies B and B implies C then A implies C.” In fact connecting two suitable machines amounts to the proof of a theorem. For example, in our first connection example above, we proved the theorem: “If there exists a machine that turns mix into cupcakes and a machine that turns things into chocolate frosted things then there is a machine that turns mix into chocolate frosted cupcakes.”
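Connecting machines can be sketched in Python by treating each machine as a function and writing one small helper that wires an output to an input. The helper name `connect` and the string descriptions of the food are my own stand-ins, not part of the story.

```python
def connect(first, second):
    """Connect the output of `first` to the input of `second` and box up the result."""
    def boxed(item):
        return second(first(item))
    return boxed


def cupcake_maker(mix):
    # Stand-in: the details of the mix are ignored; a cupcake comes out.
    return "cupcake"


def chocolate_froster(thing):
    return "chocolate frosted " + thing


def cherry_topper(thing):
    return thing + " with a cherry on top"


chocolate_frosted_cupcake_maker = connect(cupcake_maker, chocolate_froster)
deluxe_maker = connect(chocolate_frosted_cupcake_maker, cherry_topper)

print(chocolate_frosted_cupcake_maker("mix"))
print(deluxe_maker("mix"))
```

Note that `connect` builds a new machine without running anything; the new machine only does its work when something is fed into it.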
Once one gets the idea, one can visualize assemblies of connected input/output machines in many contexts. A manufacturing assembly line is exactly such a composition. Each station on the line takes the output of the previous station as input, carries out its operation on the assembly, and passes the assembly out to the next station. Our cupcake maker could be constructed in this way: A flour adder machine might take an empty mixing bowl and add a cup of flour; The bowl of flour would then become input to a milk adding machine; eventually, after all the ingredients have been added the bowl would be passed to a mixing machine; the mixing machine would pass the bowl of batter to a pouring machine; the pouring machine would pass a cupcake baking tin with batter to the baking machine; and so on.
Any multistep process can be visualized in the same way. The algorithms of elementary arithmetic are of this form. We can view the procedure for adding multidigit decimal numbers as being carried (unintended pun) out by an assembly line of machines. There is one machine for each decimal position: ones, tens, and so on. Each machine accepts two digits and a carry value, which can be 0 or 1, as input. The machine adds the two digits and the carry, prints the ones place value of the result, and passes the tens place value as the carry to the next machine. The fact that we can solve arbitrarily large problems by connecting together assemblies of uniform components is why place value systems of notation, like the decimal system, are so useful. Otherwise, we would have to learn entirely different procedures depending on the range of the numbers involved.
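The addition assembly line can be sketched as follows. The function names are illustrative assumptions; the point is only that one small machine, repeated once per decimal position, handles numbers of any size.

```python
def digit_machine(digit_a, digit_b, carry_in):
    """One station on the line: add two digits and a carry.

    Returns the digit to print in this position and the carry to pass along.
    """
    total = digit_a + digit_b + carry_in
    return total % 10, total // 10


def add_by_assembly_line(a, b):
    """Add two non-negative whole numbers by chaining identical digit machines."""
    # Split each number into digits, ones place first.
    digits_a = [int(d) for d in reversed(str(a))]
    digits_b = [int(d) for d in reversed(str(b))]
    width = max(len(digits_a), len(digits_b))
    digits_a += [0] * (width - len(digits_a))
    digits_b += [0] * (width - len(digits_b))

    printed = []
    carry = 0
    for da, db in zip(digits_a, digits_b):
        digit, carry = digit_machine(da, db, carry)
        printed.append(digit)
    if carry:
        printed.append(carry)  # the final carry becomes the leading digit

    return int("".join(str(d) for d in reversed(printed)))


print(add_by_assembly_line(478, 56))
```

The example numbers 478 and 56 are arbitrary; any pair of non-negative whole numbers goes through the same line, just with more or fewer stations.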
The grammar of a language can be usefully viewed in terms of assemblies of machines. For example, we could construct a simple sentence generating machine by connecting a “subject” machine to a “verb” machine to an “object” machine. The input could be a blank piece of paper. This is fed to the subject machine, which writes down a subject and passes the paper on to the verb machine. The verb machine adds a verb and passes the paper on. Finally, the object machine adds an object and outputs the paper. Oops, I forgot the period. So please add a period adding machine to the end of the assembly line.

This example illustrates another useful property of machine assemblies. A relatively short assembly of relatively simple machines can be capable of a large variety of possible outputs. For the machine above there are 27 possible outputs. We also see this property illustrated in our decimal system. A connected assembly of 4 decimal digit machines, for example, can generate 10000 possible numerals. And then just add two more and the assembly can generate a million!
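The counting can be checked with a short sketch. I am assuming here that each of the three word machines has three words to choose from, which is consistent with the count of 27; the particular words below are invented for the illustration.

```python
from itertools import product

# Hypothetical word lists, three choices per machine (an assumption).
subjects = ["The dog", "The cat", "The bird"]
verbs = ["sees", "likes", "chases"]
objects = ["the ball", "the tree", "the mouse"]

# Subject machine, then verb machine, then object machine, then the period adder.
sentences = [f"{s} {v} {o}." for s, v, o in product(subjects, verbs, objects)]
print(len(sentences))
print(sentences[0])

# The same counting applies to a line of four decimal digit machines.
four_digit_numerals = ["".join(ds) for ds in product("0123456789", repeat=4)]
print(len(four_digit_numerals))
```

In both cases the number of possible outputs is the product of the number of choices at each station, which is why it grows so quickly as stations are added.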
This of course is the same principle that allows English speakers to generate their entire vocabulary with an alphabet of 26 letters, and that God uses to generate all chemicals from a hundred-odd elements and all life on earth with an alphabet of 4 nucleotides.
Interestingly, there are a few principles about input/output machines that don’t depend on the specific nature of the input or the output. Let’s go back and consider our food machine business. When last we looked, our catalogue consisted of five kinds of machines: a cupcake maker, a chocolate froster, a cherry topper, a chocolate frosted cupcake maker, and a chocolate frosted cupcake with a cherry on top maker. The chocolate frosted cupcake maker was built by connecting a cupcake maker to the chocolate froster. The chocolate frosted cupcake with a cherry on top maker was built by connecting a chocolate frosted cupcake maker to a cherry topper. At some point we began getting requests for a “cherry topped chocolate froster”. This machine was to take whatever you put in and cover it with chocolate frosting and then put a cherry on top. Naturally, we built this by connecting the chocolate froster to the cherry topper. (Many people used this machine to make ice-cream sundaes.)

So business is real good and we’re selling machines faster than we can make them. Then one day I get a call from my best customer saying he needs immediate delivery of a chocolate frosted cupcake with cherry on top maker. Recall that we’ve been building this machine using a chocolate frosted cupcake maker and a cherry topper.

My problem is that the only machines I’ve got in stock are a cupcake maker and a cherry topped chocolate froster.
Now of course this has to happen when the CMD (chief machine designer) is on vacation. So I have to use my own brain. I look at what I’ve got and I say, “Well, I’ve got all the pieces I need: the cupcake maker, the chocolate froster, and the cherry topper. The problem is that they’re boxed up wrong. I’ve got the chocolate froster in the box with the cherry topper instead of in the box with the cupcake maker.” So I tell my workers to rip off the box from the Cherry Topped Chocolate Froster, disconnect the chocolate froster from the cherry topper, put the chocolate froster in the box with the cupcake maker, and connect them up to make a chocolate frosted cupcake maker. That, plus the cherry topper left over from the Cherry Topped Chocolate Froster we took apart, will give us what we need. It will probably take us all night, but it can be done.
Then some kid fresh out of third grade that we just hired says, “You don’t have to do any of that! Just hook the cupcake maker up to the Cherry Topped Chocolate Froster and it will do exactly the same thing as the usual way of making a Cherry topped chocolate frosted cupcake maker.” So we try it.
We couldn’t believe it. It worked just fine! So we asked him how he knew it would work. He said, “Those boxes you put the machines in are just for looks. They don’t make any difference at all to what the machines do. Take all those boxes off and here’s what you’ve got:”

You got three machines connected together. You got a cupcake maker connected to a chocolate froster connected to a cherry topper. You put the mix in the cupcake maker and get a cupcake. The cupcake goes into the froster and comes out a chocolate frosted cupcake. The chocolate frosted cupcake goes into the cherry topper and when it comes out you got a chocolate frosted cupcake with a cherry on top. You can put a box around the first two machines or you can put a box around the second two machines or you don’t have to put any boxes at all. It doesn’t matter!
This principle, which we’ll call the boxes don’t matter principle, says that when you connect a sequence of input/output machines together it doesn’t matter how you group them into boxes. So, if A can be connected to B and B can be connected to C then the machine

does the same thing as the machine

All this cupcake making etc. uses a lot of dishes so we decide to build dishwashing machines. We start with two basic machines: the Soaper-Scrubber that puts soap on a dish and scrubs it and a Rinser-Dryer that washes the soap off and dries the dish. Then we build our dishwasher by connecting the Soaper-Scrubber to the Rinser-Dryer.

These are big machines so it takes a big truck to move them. We have a special assembly room where we connect the Soaper-Scrubber to the Rinser-Dryer. We put the Soaper-Scrubber at the west end of the room and the Rinser-Dryer at the east end. We attach the connector and then put a box around the whole thing. One day my workers make a mistake and put the Rinser-Dryer at the west end and the Soaper-Scrubber at the east end. I say, “Oh man, we’ve got to switch these.” Then I think, oh wait a minute, maybe it doesn’t matter what order I connect them in. I’ll connect them together just the way they are. After all, whichever way I connect them the dishes will still go through both machines. The crew is saying, “That’s not going to work, boss!” But I’m the boss so we put it together my way and test it out. Here’s what happened!

Well, I learned a lesson that day. Putting boxes around machines may not matter but the order in which you connect the machines does!
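Both lessons, that boxes don’t matter but order does, can be tried out in a small Python sketch. The way a dish is modeled here (as a set of labels such as "dirty" and "soapy") is purely my assumption, chosen so that the two dishwasher orders can be compared.

```python
def connect(first, second):
    """Connect the output of `first` to the input of `second`."""
    return lambda item: second(first(item))


# --- Boxes don't matter ---
def cupcake_maker(mix):
    return "cupcake"

def chocolate_froster(thing):
    return "chocolate frosted " + thing

def cherry_topper(thing):
    return thing + " with a cherry on top"

# A box around the first two machines, followed by the cherry topper.
usual = connect(connect(cupcake_maker, chocolate_froster), cherry_topper)
# The cupcake maker, followed by a box around the last two machines.
kids_way = connect(cupcake_maker, connect(chocolate_froster, cherry_topper))


# --- Order does matter ---
def soaper_scrubber(dish):
    # Scrubbing removes the dirt but leaves soap behind.
    return (dish - {"dirty"}) | {"soapy"}

def rinser_dryer(dish):
    # Rinsing removes soap; in this model it does nothing about dirt.
    return dish - {"soapy"}

dishwasher = connect(soaper_scrubber, rinser_dryer)
backwards = connect(rinser_dryer, soaper_scrubber)

dirty_dish = frozenset({"dirty"})
print(usual("mix") == kids_way("mix"))
print(sorted(dishwasher(dirty_dish)), sorted(backwards(dirty_dish)))
```

In this model the correctly ordered dishwasher returns a dish with no labels left on it, while the backwards one returns a dish that is still soapy.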
After we got that straightened out, our dishwashers were selling really well. One day I got a call from my best customer who’s opening up a big restaurant. He wants to put the dirty dishes in at one end of the restaurant and have the clean dishes come out clear on the other side of the restaurant so he wants his dishwasher extra long. My problem is that I can’t get connectors that long.

So we’re scratching our heads over this one when that kid, remember the one that figured out that the boxes didn’t matter, says, “I know how to do it. We’ll put another machine between the Soaper-Scrubber and the Rinser-Dryer.” I say, “So tell me genius, what is it gonna do?” He says, “It’s not gonna do anything!” He calls it the “Pass-Through” machine because all it does is take whatever you put in and pass it through to the output: what you put in is what you get out.
********************************************************

The pass-through machine is usually called the identity machine, abbreviated Id, because the output is identical to the input. When you are talking to someone on the telephone the telephone system is acting like the Id machine. The microphone part of your phone is the input and the speaker part at the other end of the line is the output. You speak sounds into your input and the same sounds come out the other end’s output.

“Hi, Mom”
“Hi, Mom”
Since the Identity machine just passes its input through unchanged, connecting a machine to an Identity machine or connecting an identity machine to a machine doesn’t change what the machine does.


The identity principle says that for any input/output machine M, the machines obtained by connecting M to an Id machine or connecting an Id machine to M are the same. The three machines illustrated above are the same.
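The identity principle can be spot-checked in Python. The sketch below uses the Times 3 machine as M; agreement on a handful of sample inputs is an illustration, not a proof.

```python
def connect(first, second):
    """Connect the output of `first` to the input of `second`."""
    return lambda item: second(first(item))


def identity(item):
    """The pass-through machine: what you put in is what you get out."""
    return item


def times_3(x):
    return 3 * x


m_then_id = connect(times_3, identity)
id_then_m = connect(identity, times_3)

# Compare the three machines on the inputs 0 through 9.
for x in range(10):
    assert times_3(x) == m_then_id(x) == id_then_m(x)
print("all three machines agree on 0 through 9")
```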
The nature of telephone technology has changed radically over the last few decades, but for the next example let’s consider the telephone as it was originally designed. In that quaint version, a device called a microphone accepted sound as input and converted it to electrical signals. The electrical signals were transmitted across the wire to another device called the speaker. The speaker then converted the electrical signals back to sound. So we can picture the microphone and the speaker as two connected machines that together make an identity machine.

Let’s look at a few more examples of identity machines. Billy and Sally like to send love notes to one another in class. Since they don’t want other people to read them, they use a secret code. The way the code works is that each letter in the message is replaced by the letter that comes after it in the alphabet. So, “A” is replaced by “B”, “B” is replaced by “C”, and so on until “Y” is replaced by “Z” and then “Z” is replaced by “A”. In terms of the circular “Code” alphabet below, each letter is replaced by the letter to the right of it. To decode the message each letter is replaced by the letter to the left of it on the circle.

When Billy sends the message, he codes it using the coding method, and when Sally receives it, she decodes it using the decoding method. We can visualize this in terms of input/output machines as in the following diagram.
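Billy’s coding machine and Sally’s decoding machine can be sketched as follows. The sample message is invented, and I am assuming messages are written in capital letters, with spaces and punctuation passed through unchanged.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 capital letters, A through Z


def shift_letter(letter, steps):
    """Move a capital letter `steps` places around the circular alphabet."""
    if letter not in ALPHABET:
        return letter  # leave spaces and punctuation alone
    return ALPHABET[(ALPHABET.index(letter) + steps) % 26]


def coder(message):
    """Replace each letter by the one after it (Z wraps around to A)."""
    return "".join(shift_letter(ch, 1) for ch in message)


def decoder(message):
    """Replace each letter by the one before it (A wraps around to Z)."""
    return "".join(shift_letter(ch, -1) for ch in message)


secret = coder("MEET ME AT LUNCH")
print(secret)
print(decoder(secret))
```

Feeding the coder’s output straight into the decoder gives back the original message, so the two machines connected together behave like an identity machine on such messages.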

Cycles in nature provide more examples of identity machines. Here is an example of a water cycle identity machine.


Notice how in each of the three examples above we have built an Identity machine by connecting two machines. The first machine does something and the second machine undoes whatever the first machine does. Thus, the microphone turns sound into electrical signals and the speaker turns electrical signals into sound. The coding machine codes a message and the decoding machine “un”codes the message. Evaporation turns water into water vapor and condensation turns water vapor back into water. We will call a machine which undoes what another machine does an Unmachine for the first machine. When you connect a machine to its unmachine you get an identity machine. In other words, if M is a machine then we will say that a machine U is an Unmachine for M if M connected to U is the identity machine for whatever inputs are acceptable for M. Thus, a speaker is an unmachine for a microphone, a decoder is an unmachine for a coder, and a water condenser is an unmachine for a water evaporator.
Unmachines are useful in at least two kinds of applications. One is to fix things after we make a mistake. For example, a toy-fixing machine is an unmachine for a toy-breaking machine and a “return and get my money back” machine undoes the results of a “buy the wrong thing” machine. An even more important use of unmachines is to avoid making mistakes in the first place. One could make a good case that this is the single most important kind of application of mathematics.
We’ll illustrate the idea with a couple of very simple problems. In fact, the problems will be so simple that you’ll probably be able to solve them in your head almost instantly. There is value in looking closely at simple problems and how we solve them in order to learn methods that we can use to attack problems that are more difficult.
For our first problem suppose that you want to have a tea party for five of your friends and you want each person at the party to have two cookies. The problem, of course, is how many cookies should you buy? Now I’m sure you’ve figured that there are six people at the party (you didn’t forget yourself did you?) with two cookies each so that’s 6 x 2 = twelve cookies all together. But suppose you weren’t already so smart and imagine how you might go about solving the problem. (Pretend you’re a second grader!)
One way you might try to solve the problem is to go to the store and buy a box of cookies. Then when your friends come you could pass out the cookies and see if everybody has two. (I’ll ignore the problem of how exactly you pass the cookies out. You might try lots of different ways to see if some strategy gives two cookies per person or finally decide it can’t be done with the given box of cookies.) If not, you put the cookies back in the box, go back to the store, get another box with a different number of cookies in it, and see if that produces two cookies for each person. And you’d just keep going back to the store buying different size boxes of cookies until either you find one that gives exactly two cookies to each person or your guests get tired of waiting and go home. This is a very costly and time-consuming approach.
How could we do better? One very important thing we might realize is that we don’t need to use real cookies and distribute them to our real guests. Instead, we can use imaginary cookies and distribute them to imaginary guests. One particularly nice thing about this idea is that we don’t have to wait for our real guests to arrive and so we can try to solve the problem before they arrive. Many people don’t realize it but mathematics is all about using your imagination. In applying mathematics to solve a real problem, the first step is to create an imaginary version of the problem you want to solve. It often helps in creating this imaginary version of the problem to draw pictures or use objects to create a model of the problem. In fact this process of creating an imaginary version of the problem is called modeling. The idea is that you solve the problem for the model and then once you know the solution you apply it to the real version. The advantage of this approach is that it’s usually a lot easier to try out solutions for an imaginary problem than it is for the real problem. It’s much better to make imaginary mistakes than to make real mistakes! In our example, by using imaginary cookies we won’t have to make a lot of trips to the store to buy real cookies!
Now let’s use machines, specifically a cookie distributing machine, to build a model of our cookie problem.

The input to the Cookie Distributing Machine (CDM) is a box containing some number of cookies. The output is a distribution of the cookies to six plates as equitably as it can without breaking any cookies. In terms of this model, our problem is to find an input to the CDM which results in exactly two cookies on each plate.
Now we could use the trial and error approach to “run” our imaginary CDM with different numbers of cookies but there is an easier way to solve this problem using unmachines. Just as we can imagine a cookie-distributing machine we can imagine a Cookie Collecting Machine which collects all the cookies from the plates and puts them back in the box. The Cookie Collecting Machine is an unmachine for the Cookie Distributing Machine. If I start with a box of 9 cookies and put them through the Cookie Distributing Machine and then feed the plates of cookies through the Cookie Collecting Machine I will again have a box of 9 cookies.
So using the Cookie Collecting Machine all we have to do to solve our problem is feed it the desired output of the cookie distributing machine. The result will be exactly what we have to put into the cookie distributing machine to achieve the desired output. To solve our particular problem we feed the cookie collecting machine input consisting of 6 imaginary plates with 2 imaginary cookies on each plate and count how many cookies are in the imaginary box!
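Here is a minimal sketch of the two cookie machines. How the distributor deals out leftover cookies (one extra to each of the first few plates) is my assumption; the text only says it distributes as equitably as it can.

```python
def cookie_distributor(cookies, plates=6):
    """Deal whole cookies onto the plates as evenly as possible."""
    share, extra = divmod(cookies, plates)
    # The first `extra` plates get one more cookie than the rest.
    return [share + 1 if i < extra else share for i in range(plates)]


def cookie_collector(plate_counts):
    """Put every cookie from every plate back into the box."""
    return sum(plate_counts)


wanted = [2, 2, 2, 2, 2, 2]      # six plates, two cookies on each
box = cookie_collector(wanted)   # run the desired output through the unmachine
print(box)
print(cookie_distributor(box))
```

The last line is a check: running the answer back through the distributor reproduces the six plates of two cookies each.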

Now you probably solved the original problem by just counting the cookies on your imaginary plates (maybe even using multiplication). When you do that you are in effect collecting the cookies by treating them as a part of a single collection which is what the unmachine does explicitly.
Consider another example. You and two of your friends want to buy a pizza that costs 6 dollars. Assuming you split the cost equally how much should each of you contribute? You probably solved this in your head thinking, “6 dollars, 3 people, 6 divided by 3 is 2, so 2 dollars each.” But again, let’s see how we might use a machine model of the problem to solve it.
For this problem we have a Money Collecting Machine that will take the equal contributions from each of the friends. For example, if each of the friends put in 3 dollars the machine would collect 9 dollars.



In terms of the Money Collector we can state our problem as, “Find an input to the money collector for which the output is 6 dollars.” The unmachine for the Money Collector is the “Three Friend Money Distributor.” You feed the “Three Friend Money Distributor” some number of dollars and it tries to distribute the dollars equally among the three friends.

So to solve the problem of finding what input will cause the Money Collector to output 6 dollars we can feed this desired output of 6 dollars to the Money Distributor and see how much is returned to each friend.

In this case we see that the Money Distributor, for input 6 dollars, gives each of the three friends 2 dollars; so 2 dollars is the amount each friend must give the Money Collector to make 6 dollars. At some point you learned that equal distribution is one of the ways of interpreting division. So I suspect that when you solved this problem in your head you went through this very same process, only much more quickly, to reduce the original problem to how to distribute 6 dollars equally to three people which you solved by dividing.
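The pizza problem can be sketched the same way. The function names are illustrative, and I am assuming whole-dollar amounts, so the distributor also reports any dollars that cannot be shared out equally.

```python
def money_collector(each_pays, friends=3):
    """Collect the same contribution from every friend and return the total."""
    return each_pays * friends


def money_distributor(total, friends=3):
    """Share `total` dollars equally; return each share and any leftover dollars."""
    return divmod(total, friends)


share, left_over = money_distributor(6)  # feed the desired output to the unmachine
print(share, left_over)
print(money_collector(share))            # check: the collector gives back 6
```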
My point in the previous two examples is definitely not to convince you that you should change the way you solve these kinds of problems. I am very much a pragmatist when it comes to solving math problems. Once again the idea is that by understanding the process in detail we can hope to extend it to solving more difficult problems.
Let’s summarize the “UnMachine” method of problem solving. The first step is to model your problem as a problem of the form: for a given Input/Output machine M find an input that produces a given output. Then, assuming you have an UnMachine for M, you solve the problem by running the desired output through the UnMachine. I sometimes like to think of this method as “running a movie backwards.” We imagine that we have a movie of the problem being solved, for example a movie of three friends putting their money together and arriving at a total of 6 dollars. They reach into their pockets and take out some amount of money, the same amount for each, and put it in a pile having exactly 6 dollars on the table. To solve the problem we start the movie at the point where the problem is solved and run the movie backwards to see how we solved it! For example, we run the movie backwards from the point where 6 dollars is on the table until we get to the point where the friends have taken the money out of their pockets. Then we can just stop the movie and see how much they took out! Running the unmachine is exactly like running the movie backwards.
Now, of course, to apply this method we have to be able to find the unmachine for a given machine. For simple machines, by which I mean machines which aren’t assemblies of connected machines, we just have to know from experience or by being told how to “undo” what the machine does. Our experience teaches us that distributing undoes collecting and collecting undoes distributing. For compound machines, that is, machines that are constructed as assemblies of connected machines, there is a principle of unmachine construction which can help us. By looking at an example you’ll realize that you already know this principle and use it all the time! Consider the process of putting on your socks and shoes. We can visualize this in terms of a machine called the “Socks and Shoes Putter-Onner.” You put bare feet into one end of the machine and out the other end come feet with shoes and socks on. We can build this machine out of two machines: a “Socker,” which puts socks on bare feet, and a “Shoer,” which puts shoes on socked feet.


The unmachine for this will be called the “Shoes and Socks Taker Offer.” Notice that I switched the order of “shoes” and “socks” in the name and notice that this corresponds to common usage. People usually say, “put on your socks and shoes”, but, “take off your shoes and socks.” As you probably realize, at least after thinking about it, we reverse the order because when you dress your feet you first put on the socks and then put on the shoes, but when you undress your feet, you must first take off the shoes and then take off the socks. So the shoes and socks taker offer looks like:


In general, in order to undo the result of a sequence of operations, you undo each of the operations in the reverse of the order in which they were performed. Thus in dressing your feet you first put on your socks and then put on your shoes. In order to undress your feet you first take off your shoes and then take off your socks. As another example, suppose that in order to get from home to my destination I go north 3 miles, then east 2 miles, then north 5 miles, and then west 1 mile. The way I undo the operation of going some distance in some direction is to go the same distance in the opposite direction. For example, to undo going north 10 miles I go south 10 miles. So to go home again, I “reverse” my path. As illustrated below, I go east 1 mile, then south 5 miles, then west 2 miles, and finally south 3 miles.
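The trip home can also be worked out mechanically. What follows is only a minimal Python sketch of the idea: the names `OPPOSITE`, `undo_path`, and `displacement` are invented for this illustration, and representing a move as a (direction, miles) pair is an assumption of the sketch, not something the argument depends on.

```python
# Illustrative sketch: a path is a list of (direction, miles) moves.
OPPOSITE = {"north": "south", "south": "north", "east": "west", "west": "east"}


def undo_path(path):
    """Undo each move (same distance, opposite direction), last move first."""
    return [(OPPOSITE[direction], miles) for direction, miles in reversed(path)]


def displacement(path):
    """Return the net (east, north) distance in miles covered by a path."""
    east = north = 0
    for direction, miles in path:
        if direction == "north":
            north += miles
        elif direction == "south":
            north -= miles
        elif direction == "east":
            east += miles
        elif direction == "west":
            east -= miles
        else:
            raise ValueError(f"unknown direction: {direction}")
    return east, north


trip = [("north", 3), ("east", 2), ("north", 5), ("west", 1)]
way_home = undo_path(trip)

print(way_home)                        # the moves that take me home again
print(displacement(trip + way_home))   # net movement after the round trip
```

Following the trip and then the way home leaves a net displacement of zero in both directions, which is just another way of saying that the two paths connected together act as an identity machine on my position.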

In terms of input/output machines we state the unmachine principle as follows. If Un-A is an unmachine for A and Un-B is an unmachine for B, then Un-B connected to Un-A is the unmachine for A connected to B.
We can prove the unmachine principle using the previously stated principles as follows. We take A connected to B and connect it to Un-B connected to Un-A. We show this below where we have used the notation “M→N” to name the result of connecting machine M to machine N. Our task is to show that un-B→un-A is the unmachine for A→B. This means that we want to show that the result of connecting A→B to un-B→un-A is the Identity machine.
Because boxes don’t matter this machine is the same as the following where we’ve decided to put a box around B connected to Un-B.

Since Un-B is an unmachine for B, B connected to Un-B is the Identity machine. So the above is the same as
Which by the identity principle, i.e., the fact that the identity machine just passes its input through unchanged, is the same as
![]()
which, because Un-A is an unmachine for A, is the same as
![]()
This is what we wanted to show. We can restate this proof using our arrow notation as follows. (Instead of “boxes” we use parentheses and we’ll rename the “boxes don’t matter principle” to “parentheses don’t matter.” We give the reason an equality is true in “{}” next to the equation.)
(A→B)→(un-B→un-A) = A→(B→un-B)→un-A {Parentheses don’t matter}
= A→Id→un-A {Definition of un-B}
= A→un-A {Identity principle}
= Id {Definition of un-A}
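The same principle can be tried out on the socks and shoes example. The Python below is a rough sketch under some assumptions of its own: feet are modeled as a list of layers, and the helper names `connect`, `put_on`, and `take_off` are hypothetical, chosen only for this illustration.

```python
def connect(m, n):
    """Return the machine m→n: run m first, then feed its output to n."""
    return lambda x: n(m(x))


def put_on(layer):
    """Build a machine that adds `layer` as the new outermost layer."""
    return lambda feet: feet + [layer]


def take_off(layer):
    """Build an unmachine that removes `layer`, which must be outermost."""
    def unmachine(feet):
        if feet[-1] != layer:
            raise ValueError(f"cannot take off {layer}: {feet[-1]} is in the way")
        return feet[:-1]
    return unmachine


socker, shoer = put_on("socks"), put_on("shoes")
un_socker, un_shoer = take_off("socks"), take_off("shoes")

putter_onner = connect(socker, shoer)        # A→B
taker_offer = connect(un_shoer, un_socker)   # un-B→un-A

bare = ["bare feet"]
dressed = putter_onner(bare)
round_trip = connect(putter_onner, taker_offer)(bare)

print(dressed)      # feet with socks and then shoes on
print(round_trip)   # bare feet again
```

In this sketch, connecting the unmachines in the wrong order, `connect(un_socker, un_shoer)`, raises an error on dressed feet, because the shoes are in the way of the socks.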
This ends our digression into Input/Output machine properties. Now let’s see how we can apply these ideas to solving some algebra-like problems. We start by considering Input/Output machines defined by the basic arithmetic operators. For any number n we can define input/output machines “Add n”, “Subtract n”, “Times n”, and, for n not equal to 0, “Divide n”. For example,
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
The last four examples above show how we can use each of the arithmetic operators to define the identity machine for numbers. Of course, we can connect arithmetic machines to make compound machines. Here are some examples.








The last four examples above illustrate the unmachine relationship between addition and subtraction and multiplication and division. That is, for any n, Subtract n is the unmachine for Add n and for any n not equal to 0, Divide n is the unmachine for Times n. (Remember that “division by 0” is not defined.)
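Here is one way the four families of arithmetic machines might be written down in Python. This is a sketch, not a definitive implementation: the factory names `add`, `subtract`, `times`, and `divide` are made up for the illustration, and `fractions.Fraction` from the standard library is used (an assumption of the sketch) so that division gives exact answers.

```python
from fractions import Fraction


def connect(m, n):
    """Return the machine m→n: run m first, then feed its output to n."""
    return lambda x: n(m(x))


def add(n):
    return lambda x: x + n


def subtract(n):
    return lambda x: x - n


def times(n):
    return lambda x: x * n


def divide(n):
    if n == 0:
        raise ValueError("Divide 0 is not defined")
    return lambda x: Fraction(x) / n   # exact division, no rounding


# Subtract n undoes Add n, and Divide n undoes Times n (n not 0).
add_then_subtract = connect(add(5), subtract(5))
times_then_divide = connect(times(3), divide(3))

# The problem "If 3 x X = 12 what is X?": run 12 through the unmachine of Times 3.
x = divide(3)(12)
print(x)
```

Running 12 through `divide(3)` gives 4, the answer to the third grade problem that started this discussion.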


These facts allow us to find the input which gives a specified output for any of the operators.




We can also apply the principle of unmachines to solve such problems for compound arithmetic machines by connecting the corresponding unmachines in the reverse order. Thus,


It’s always nice when a new tool can be introduced in an environment where it has an obvious use and value. Expressing compound operations in terms of boxes and arrows, while descriptive, cries out for a more concise notation. This is a context in which the idea of a variable and symbolic expression would be appreciated. We might start by looking for a descriptive name for the Add 3 connected to Times 3 machine. From this we develop the idea of assigning a symbolic name, i.e. a variable, to the input and using it in an expression.
![]()
Then we notice that we can compute the final output by substituting the actual input for X and so the symbolic expression by itself is enough to describe what the compound machine does. So we then might introduce a notation like

The idea here is that the “X” label on the input indicates that that is the symbolic name we are giving to the input. Then we note that if there’s no confusion as to which name is being used for the input we can just get rid of the boxes altogether and talk about the machine “3 x (X + 3)” and explain that this form of description for a machine is called a “formula”. The advantage of this approach, I think, is that it provides an interpretation for the notion of formula. After reaching this point of abstraction we note that we can create more complex expressions, including expressions that may or may not be expressible as compound arithmetic machines. This then can become a context for talking about strategies for reducing expressions to composite forms which can be solved using the techniques described above.
While the concept of machine provides an interpretation for formulas and equations involving numbers, we so far have no interpretation for numbers themselves or operators on numbers. The goal of the remaining articles is to use concepts based on machines to develop such interpretations and use them to understand the behavior and application of numbers.
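As a small illustration, the formula “3 x (X + 3)” and its unmachine can be written as two Python functions. This is only a sketch: the names `formula` and `un_formula` are invented here, and `Fraction` is used (an assumption of the sketch) so that the division step is exact.

```python
from fractions import Fraction


def formula(x):
    """The machine "3 x (X + 3)": Add 3 connected to Times 3."""
    return 3 * (x + 3)


def un_formula(y):
    """Divide 3 connected to Subtract 3: the unmachines in reverse order."""
    return Fraction(y) / 3 - 3


output = formula(2)          # substitute the actual input 2 for X
recovered = un_formula(output)
print(output, recovered)
```

Substituting 2 for X gives 15, and running 15 through the unmachine (first divide by 3, then subtract 3) gives back 2. Note that the unmachine undoes the multiplication first because the multiplication was done last.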
The purpose of this section is to fill in certain details that would interrupt the flow of the concepts discussion if discussed immediately. Some of these remarks may fall into the category of pedantic – please forgive me. (Maybe this section should be called “Pedantics”?) I will also use this section to introduce “standard” terminology and notation for the concepts discussed in this article. (For example what I have called the “boxes don’t matter” principle is generally called the associative principle.)
At various points in our
discussion, we have considered questions about whether two machines are the “same”
or “do the same thing.” For example, in
looking at the result of connecting the soaping and rinsing machines, we
noticed that switching the order in which the components were connected could
result in machines that were not the same. (One produced clean rinsed dishes and the other produced soapy
dishes.) On the other hand, we
asserted in the “boxes don’t matter principle” that two connected sequences of
machines which differed only in the way the component machines were grouped
were the same. This raises the
question of what exactly we mean when we say two Input/Output machines are the same. We obviously don’t intend “same” to mean
identical. After all, two compound
machines with different groupings of components are clearly not identical.
In general, the meaning of an
assertion that two things are “the same” depends on the context in which it is
used. For example, someone may see me
in my new car and say, “I just bought the same car!” Unless one of us was swindled, he doesn’t
mean that the car he bought is literally the same one as mine. From experience, we know that he probably
means that he bought the same model of car. Sameness comes in varying degrees with one extreme being absolute
identity. We can measure the degree to
which two objects are the same by the collection of statements about the
objects on which they must agree. If A
is absolutely identical to B then every statement, which is true of A, will be
true of B and vice versa. If A and B
are the same model of car, but not “one and the same”, then the truth of
statements about the manufacturer and certain features of the car, like the
number of seats, must be the same. On
the other hand, the truth of other statements, like where the car was purchased
or the exact physical location of the car at a given time may differ.
Usually a notion of sameness
is defined by some collection of tests we can apply to determine whether the
objects are the same. A test for
literal identity of a car might consist of comparing vehicle identification
numbers or determining whether the cars are located at exactly the same
physical location at the same time.
Another kind of sameness test might put the two objects on a balance
scale and see if the scale is balanced.
This notion of “sameness” would guarantee that statements concerning the
weight of the objects would have the same truth value.
Regardless of what class of
tests is used to define a notion of sameness, there are certain properties
which a “proper” mathematical notion of “sameness” is expected to satisfy:
· Reflexivity: a is the same as a.
· Symmetry: if a is the same as b then b is the same as a.
· Transitivity: if a is the same as b and b is the same as c then a is the same as c.
Any relationship between
objects which satisfies these properties is called an equivalence relation. The first two properties of an equivalence
relation seem clear enough. No test can
distinguish something from itself and any test that would distinguish A from B
would distinguish B from A. Transitivity,
however, is more problematic. Consider
for example, the notion of “sameness” based on color where the sameness test
for two objects is based on whether or not I, the author of this article, when
presented with two objects can distinguish their color. Consider the following sequence of colored
tiles.
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
Looking at the entire
sequence, I can see that the color is gradually changing from red towards the
blue end of the spectrum. On the other
hand, when looking at any two adjacent tiles, I personally cannot distinguish
the colors. The change is sufficiently
gradual that I can only distinguish the cumulative change over several tiles. Perhaps your color sensitivity is greater
than mine. If so imagine that we
allowed the same change to occur over 10 or 100 or 1000 times as many
tiles. Surely, at some point it would
be the case that you could not distinguish the colors of any two adjacent tiles
even though you could easily distinguish between the color of the first and the
last tile. This notion of “sameness”
based on our ability to perceive differences in color is not transitive. If it were, then the first and the last
tiles would have to be the same. To see
this, suppose the relation of having indistinguishable color were
transitive. Since the first and second
tiles are the same, and the second and third tiles are the same the first and
third tiles must be the same. Then,
since the third and fourth are the same it would have to be the case that the
first and fourth tiles are the same. In
the same way we would conclude that the first and fifth tiles were the same and then
the first and sixth and so on until we concluded that the first and last tiles
had to be the same.
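The tile argument can be imitated numerically. In the sketch below each tile’s color is reduced to a single made-up number, and “indistinguishable” is modeled as “differing by less than a fixed threshold.” Both the numbers and the threshold are assumptions chosen purely for illustration; they are not measurements of anyone’s eyesight.

```python
THRESHOLD = 5  # hypothetical: color differences smaller than this go unnoticed


def looks_same(color_a, color_b):
    """Perceptual 'sameness': the difference is too small to notice."""
    return abs(color_a - color_b) < THRESHOLD


# Ten tiles whose (made-up) color value changes by 3 from each tile to the next.
tiles = [3 * i for i in range(10)]

adjacent_all_same = all(looks_same(a, b) for a, b in zip(tiles, tiles[1:]))
first_and_last_same = looks_same(tiles[0], tiles[-1])

print(adjacent_all_same)     # every adjacent pair is indistinguishable
print(first_and_last_same)   # but the first and last tiles are not
```

Every adjacent pair passes the test while the first and last tiles fail it, so this notion of sameness is reflexive and symmetric but not transitive. In fact transitivity already fails after two steps: the first and third tiles differ by 6, which is above the threshold.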
A property like color, for
which differences can be made arbitrarily small, is called a continuous
property. As another example, consider
liquid volume and imagine the change caused by adding one drop of water to your
bath. I doubt that anyone could
perceive the effect of one drop on the volume.
However, if the drops were allowed to continue long enough you would eventually notice
because there would be bath water all over the floor, which would eventually
fill your house and eventually cover the world! Perception in the real world, even perception enhanced by
instruments and measuring devices, is inexact for continuous properties and so
a notion of sameness based on whether there is a perceptible difference in a
continuous property will in general not be transitive and therefore will not
define an equivalence relation.
A property for which
differences are always perceptible will be called separated or discrete. For a separated property sameness based on
that property is transitive. To see
this, suppose that on the basis of a separated property, A is the same as B and B is the same as
C. Then A, B, and C must all have
identical values for the property and so A and C do and so A is the same as C. One way to ensure that a property is
separated is to restrict the collection of objects that we are willing to
compare. For example, in general,
height is a continuous property, but suppose we choose a particular size block
whose height is perceptible and only compare towers built by stacking blocks of
this size. Within this collection of
objects, if the heights of two objects differ they will differ by at least the
height of one block and so the difference will be perceptible.
![]()

![]()
If we are working with a
collection of objects for which “sameness” is an equivalence relation, we may
write “a = b” for “a is the same
as b.” Unless we say otherwise, we will
assume that a notion of sameness used in some context is an equivalence
relation.
When we want to say that two
objects are one and the same, i.e. must be equivalent in all equivalence
relations, we will say that they are identical.
It isn’t always possible to
connect two input/output machines. We
can’t connect a cupcake making machine to a telephone or a candy making machine
to a dishwasher, for example. The
proper input for a telephone is a sound and the proper input for a dishwasher
is a dirty dish.

![]()



The class of objects or
values that an input/output machine can accept as input is called the domain
of the machine. The domain of a
dishwasher consists of dishes and the domain of a telephone consists of
sounds. The class of values or objects
which a machine can output is called the range of the
machine. The range of the Hershey Bar
Factory consists of Hershey Bars and the range of a properly functioning
dishwasher is clean dishes.
In order to connect machine M
to machine N to form M→N, the possible output of M must be acceptable
input for N, or in terms of the new terminology, the range of M must be
contained in the domain of N. For
example, since the class of clean dishes is contained in the class of all
dishes, we can connect a dishwasher to another dishwasher (though this is
wasteful since on already clean dishes, a dishwasher acts as an identity
machine.) Since Hershey bars are a form
of candy we could connect the Hershey Bar factory to a machine whose domain is
all candy, e.g. a candy wrapping machine.
Note that the domain of M→N
is identical to the domain of M.
We will make an assumption
regarding the notion of sameness for a domain:
If a1 is in the domain of M and a2 is the same as
a1 then a2 is in the domain of M. In other words, if a machine accepts an
object as input then it must accept any object which is the same as the first as
input.
We want to define a notion of sameness for input/output machines that is based solely on the relationship between input and the corresponding output produced by the machine. In determining whether two machines are the same in this sense, we are not concerned with how the machines are constructed, or the appearance of the machines, or even how quickly or efficiently the machines operate.
One requirement for two machines to be the same is that they must have the same domain and range. This will ensure that wherever one of the machines can be connected, the other can also. The other requirement we would like to impose for two machines to be the same is that there is no test based solely on analyzing the input and corresponding output of the machines that can distinguish between the two machines. You can imagine that in trying to distinguish between the two machines we ask a third party to feed a collection of different inputs to the two machines and report the corresponding outputs returned by the two machines. The machines are the same if no collection of inputs and corresponding results would enable us to distinguish between them.
Unfortunately, it is not obvious how to formulate the second requirement precisely. As an example of the kind of problem this poses, consider a machine that operates on a whole number input by adding the result of rolling a pair of dice to the input.
So for example, if I feed 2
into the input and roll a 5 then the output will be 7. If I feed 2 in again and roll a 9, the
output will be 11. This is an example
of what is called a non-deterministic machine. This means that even if we have complete knowledge about the
input and the history of the machine we cannot predict with certainty the value
of the output. In particular, there is
no guarantee for such a machine that the same input will generate the same
output. Now suppose someone places this
machine in a black box and allows me to test it by feeding it various inputs and
observing the output. Then they have me
leave the room so that they can have a chance to replace the machine with a
possibly different machine. They then
have me return and ask me to determine whether the machine in the box is the
same as or different from the machine I was testing before. I run my test through the possibly different
machine and compare the new output to the output from the previous experiment. The fact that the results may be different
or even the same does not in itself guarantee that the machines are different
or the same. For example, the tester
may not have changed the machine at all and I would almost certainly see at
least some differences in the output because of the randomness of the
dice. On the other hand, the tester
might have changed the machine by replacing the dice with a different
randomizer, perhaps a spinner, with different odds for the possible outcomes,
and yet, again through randomness, the output might still be the same.
For the purposes of
understanding numbers and arithmetic we can avoid subtleties like these by
restricting the kinds of input/output machines we consider to a more reliable
subset.
A function is
an input/output machine for which the output is completely determined by the
input. More precisely, an
input/output machine f is a function if for any a1, a2 in
the domain of f with a1 the same as a2, if at some time,
place, and context f on input a1 outputs b1 and at some
time, place, and context f on input a2 outputs b2 then b1
is the same as b2.
A function has the property
that the same input will always produce the same output. Non-deterministic machines,
like the dice machine described above, are not functions. The names f, g, and h are
typically used for functions. If f is a
function and b is the output of f for input a then we may write f(a)=
b and say “f of a equals b.” This
notation makes sense because f always outputs (something the same as) b
for input a. Also, in case f(a) = b, we call b the value of “f of a” or
say “b is the result of applying f to a.”
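The contrast between the dice machine and a function can be seen in a few lines of Python. This is a sketch under stated assumptions: the names `dice_machine` and `add_seven` are invented for the illustration, and the standard library’s `random` module stands in for a real pair of dice.

```python
import random


def dice_machine(n, rng):
    """Non-deterministic machine: adds the roll of a pair of dice to n."""
    return n + rng.randint(1, 6) + rng.randint(1, 6)


def add_seven(n):
    """A function: the output is completely determined by the input."""
    return n + 7


rng = random.Random(0)  # seeded only so that the experiment can be repeated

dice_outputs = {dice_machine(2, rng) for _ in range(1000)}
function_outputs = {add_seven(2) for _ in range(1000)}

print(sorted(dice_outputs))      # many different outputs for the same input 2
print(sorted(function_outputs))  # always the same single output
```

Feeding 2 into the dice machine a thousand times yields a whole collection of different outputs, each somewhere between 4 and 14, so it makes no sense to write “dice_machine(2) = b” for any single b. The function produces exactly one value for the input 2 no matter how often it is asked.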

Defining sameness for
functions is straightforward.
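Functions f and g are
the same, which may be written f = g, if
· f and g have the same domain and range, and
· for every a in that domain, f(a) is the same as g(a).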
As we mentioned earlier, the
important consequence of a notion of sameness is that for a certain class of
statements, what’s true of one object will be true of any other object which is
the same as the first. For functions
this amounts to the following:
Suppose
f1 = f2 and g1 = g2 and f1
can be connected to g1. Then
f2 can be connected to g2 and f1→g1
= f2→g2 . (If a relationship, e.g., sameness, which
holds between objects also holds for the results of applying some operation,
e.g. connection, to those objects then we say that the operation preserves
the relationship. We are thus
asserting that connection preserves sameness for functions.)
To
show that f2 can be connected to g2 we need to show that
any possible output of f2 is an acceptable input for g2. So suppose b2 is a possible
output of f2 . Then, for
some a, which is an acceptable input of f2, we have f2(a)
= b2. Since f1= f2,
a must be an acceptable input for f1. Suppose f1(a) = b1. Since f1 = f2, b1 is the same as b2. Since f1 can be connected to g1,
b1 is acceptable input to g1 and so since g1 =
g2, b1 is acceptable to g2 and so since b1
= b2, b2 is acceptable to g2.
To
show that f1→g1 =
f2→g2 we need to show that they have the
same domain and that for any a in that domain, (f1→g1)(a)
is the same as (f2→g2)(a). The domain of f1→g1
is identical to the domain of f1, which is identical to the domain
of f2, because f1 = f2, and the domain of f2
is identical to the domain of f2→g2. Thus the domains are the same. Let a be an element of this common
domain. Suppose f1(a) = b1,
g1(b1) = c1, f2(a) = b2, and g2(b2)
= c2. So (f1→g1)(a)
= g1(f1(a))=g1(b1) = c1
and similarly, (f2→g2)(a) = c2. We want to show that c1 is the
same as c2. Since f1
is the same as f2 we must have that f1(a) is the same as
f2(a), and so b1 is the same as b2. Hence, since g1 is the same as g2,
g1(b1) is the same as g2(b2), and
so c1 is the same as c2. Thus f1→g1 = f2→g2.
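For functions on a small finite domain, the claim just proved can also be checked by brute force. The Python below is a hedged sketch: `same_function` only compares outputs over an explicitly listed domain (it does not compare ranges), and the particular functions f1, f2, g1, g2 are hypothetical examples chosen for the illustration.

```python
def connect(m, n):
    """Return the machine m→n: run m first, then feed its output to n."""
    return lambda x: n(m(x))


def same_function(f, g, domain):
    """Sameness test restricted to an explicitly listed, finite domain."""
    return all(f(a) == g(a) for a in domain)


domain_f = list(range(20))

f1 = lambda a: 2 * a + 2      # built differently from f2 ...
f2 = lambda a: 2 * (a + 1)    # ... but with the same input/output behavior
g1 = lambda b: b * b
g2 = lambda b: b ** 2

domain_g = [f1(a) for a in domain_f]   # every possible output of f1

f_same = same_function(f1, f2, domain_f)
g_same = same_function(g1, g2, domain_g)
connected_same = same_function(connect(f1, g1), connect(f2, g2), domain_f)

print(f_same, g_same, connected_same)
```

A check like this is not a proof, since it only tests the inputs listed, but it shows concretely what “connection preserves sameness” is asserting: machines that are built differently yet behave the same can be swapped for one another inside a larger machine.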

From now on we will generally
“confuse” sameness with equality. That
is, we will write “a = b” when we mean “a is the same as b” and the notion of
sameness will be understood from the context.[1] We will further use the property that equals
may be substituted for one another in expressions and sentences without
changing the value or truth of the expression or statement.
There are at least three mechanisms by which an input/output machine may fail to be a function, that is, by which the same input may generate different outputs depending on the time, place or context in which the machine is operated. As we have seen, one way is if the output of the machine is affected by some random operation like the roll of dice. Another is if the input/output machine has a memory. For example, imagine yourself as a machine standing in a kitchen and your input is a hot stove and your output will be some action. The very first time you encounter the hot stove your action might be to touch the hot stove. This will be very painful of course and the memory of this will be stored in your brain so that the next time you encounter the hot stove input, your action will probably be different!


For a less dangerous example, consider a machine which has a memory for one number. When the machine is first made, a 0 is stored in the memory. The action of the machine on input n is to add n to the value in the memory, store the result in the memory, and output the result. We illustrate this below for inputs equal to 1. (The contents of the memory are represented by the box inside the machine.)




The sequence of diagrams above shows the memory of the machine and the output for successive inputs of 1. When 1 is input the first time, the content of the memory is 0. The machine outputs (1 + 0) = 1 and stores 1 in the memory. When 1 is input the second time, the content of the memory is 1 and so the machine outputs (1 + 1) = 2 and stores 2 in the memory. The third time 1 is input, the machine outputs 3 and stores 3 in the memory and so on. We see that the behavior of the machine changes over time as a result of past experience. The attribute of a machine which causes its behavior to change as a result of previous operations is called the state of the machine. We’ll argue in the next article that the notion of state is at the heart of the concept of number. Machines with state however do not behave as functions.
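The memory machine is easy to imitate in Python. The class name `MemoryMachine` is invented for this sketch; the behavior follows the description above (start with 0 in the memory, add the input to the memory, store and output the result).

```python
class MemoryMachine:
    """A machine with state: it remembers a running total of its inputs."""

    def __init__(self):
        self.memory = 0          # a 0 is stored when the machine is first made

    def __call__(self, n):
        self.memory += n         # add the input to the value in the memory
        return self.memory       # output the result, which is also stored


machine = MemoryMachine()
outputs = [machine(1), machine(1), machine(1)]
print(outputs)
```

The same input, 1, produces three different outputs, 1, 2, and 3, which is exactly why a machine with state does not behave as a function.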
A third way in which a machine may fail to behave as a function is if its behavior is affected by the context in which it is used. As an example, imagine that we are operating a bakery with many workers including two who we’ll call Bob and Sally. Imagine that Bob is madly in love with Sally. Bob is a cookie decorator and Sally is a cookie maker. We connect cookie makers to cookie decorators to create “decorated cookie” makers.


![]()
![]()
![]()
![]()
![]()
![]()


![]()
![]()
![]()
![]()
![]()
![]()
When Bob is connected to any cookie maker other than Sally he decorates the cookie with a blue happy face, but when connected to Sally he can’t help but place a big red heart on the cookie. Bob is not behaving like a proper function. Functions only care about the input itself, not where it came from or how it was received.
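One rough way to picture Bob’s problem in Python is to notice that his output can only be computed if we are told something besides the cookie. In the sketch below the second parameter `maker`, the cookie description, and the co-worker name “Alice” are all hypothetical details added for the illustration.

```python
def bob(cookie, maker):
    """Bob's decoration depends on who handed him the cookie, not only on the cookie."""
    decoration = "big red heart" if maker == "Sally" else "blue happy face"
    return f"{cookie} with a {decoration}"


def well_behaved_decorator(cookie):
    """A proper function of the cookie: the source of the input plays no part."""
    return f"{cookie} with a blue happy face"


from_sally = bob("sugar cookie", maker="Sally")
from_alice = bob("sugar cookie", maker="Alice")   # Alice: a hypothetical co-worker
print(from_sally)
print(from_alice)
```

The same cookie comes out decorated in two different ways, so viewed as a machine from cookies to decorated cookies Bob is not a function. The well-behaved decorator has no way even to ask where its cookie came from.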
If we have three input/output machines I, J, and K such that I can be connected to J and J can be connected to K then we can form new machines (I→J)→K and I→(J→K) or diagrammatically


As we have discussed, the boxes don’t matter principle, which in more refined discourse is called the associative principle, is the assertion that under the assumptions above, the two machines (I→J)→K and I→(J→K) are the same. The idea is that both machines are really just

with some purely decorative boxes (the green and red boxes in the diagrams above) put around various parts in different ways. Now this may seem like a simple observation, but, as we shall see in subsequent articles, this principle is the basis for all of the laws of arithmetic!
One way to understand this principle is to consider how it could fail. If the associative principle were going to fail then somehow the underlying machines, I, J, and K would have to be “aware” of the outer boxes and that awareness would have to affect their behavior. Now all devices are to some extent dependent on their environment. For example, they may need a power supply and they may only operate within a certain temperature range. This kind of dependence determines whether the machine operates at all, but as long as the environmental conditions allow the machine to operate we do not expect the machine to be aware of the environment. In particular, we would not expect the behavior of the device to be affected by the source of its input or the destination of its output. We will call machines whose behavior is independent of the source of their input and the destination of their output, components. In our earlier example of the cookie factory, Bob was not behaving as a component because his behavior differed depending on whether he was accepting input from Sally or someone else.
Components will satisfy the associative principle. For, suppose I, J, and K are components and consider the behavior of the differently associated machines, (I→J)→K and I→(J→K), for some acceptable input a. In the first form, (I→J)→K, a is fed into (I→J) which feeds a into I while for the second form, I→(J→K), a is fed directly into I. In the first case the output of I will be fed directly into J while in the second the output will be fed into (J→K) but since I doesn’t care about source or destination it will compute in exactly the same way. Similarly J will receive the same input, compute its output in the same way and that output will be fed to K which produces the final output independent of the source of its input.
Since the value of a function depends only on the input and not on the source of the input or the use of the output, functions are components and so will satisfy the associative principle. Of course, we can also verify this directly. Suppose f, g, and h are functions and f can be connected to g and g can be connected to h. The domain of (f→g)→h is the domain of (f→g) which is the domain of f which is the domain of f→(g→h). Now let a be in the domain of f. Suppose f(a) = b and g(b) = c and h(c) = d. The two diagrams below illustrate the computations of (f→g)→h and f→(g→h) for input a.
We’ll interpret these diagrams starting with the first. Notice that the diagram consists of three levels. Each level can be thought of as a collection of one or more problems to be solved. The top level represents the main problem: “What is the value of ((f→g)→h) for input a?” We can’t solve this problem directly so we reduce this problem to two simpler problems: “What is the value of (f→g) for input a?” and “What is the value of h on input (f→g)(a)?” The red dotted arrows in the diagram go from higher levels to lower levels and indicate where we have reduced a harder problem to a set of simpler problems. The problem of computing the value of (f→g) on input a is still too complicated to solve directly so we again reduce it to simpler problems: “What is the value of f on input a?” and “What is the value of g on input f(a)?” These two problems can be solved directly: the value of f(a) is b and the value of g(b) is c. By combining the solutions of these simpler problems, we obtain the result (f→g)(a) = c. The green dotted arrows in the diagram go from lower levels to higher levels and indicate where solutions to simpler problems have been combined to obtain a solution to a harder problem. Thus the green arrow from the third level to the second level indicates that we have combined the solutions of the problems for f and g to obtain a solution to the problem for (f→g): (f→g)(a) = c. We then directly solve the problem of determining the value of h on input c and then combine our solutions to obtain the solution for the original problem: ((f→g)→h)(a) = d.
The analysis of the second way of expressing f→g→h is essentially the same except that a different strategy for breaking the problem into simpler problems is used. Again we obtain the result (f→g→h)(a) = d. In effect, the associative principle is a guarantee that regardless of how we break the problem down, the results will be the same. The problem solving strategy illustrated by these diagrams is a very general one which applies as well to composing pictures or writing essays or solving engineering and scientific problems as to evaluating mathematical expressions. This problem solving strategy is sometimes called “divide and conquer.” The basic steps of divide and conquer are:
· Divide a complex problem into a collection of simpler component problems.
· Solve the component problems.
· Combine the solutions to the component problems to construct a solution to the original problem.
The process of dividing a complex problem into simpler problems is called reduction. The process of combining component solutions to construct a solution of the original problem is called synthesis. Note that, as in the example, the component problems may in turn be solved by divide and conquer. In general, we may have to continue down many levels of the upside down tree until we finally reach problems which we can solve without further reduction. In order for this technique to work it must be possible to infer the behavior of a combination of components from the behavior of the independent components. We can thus view the associative principle as a kind of guarantee that a reduction/synthesis approach to analyzing behavior will work. It is this property of mechanical, electronic, and computer software components that allows us to create, manage, and maintain vast and complex systems with confidence that their behavior can be predicted and reasoned about.
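The reduction and synthesis steps can be mimicked in a short recursive Python sketch. The representation is an assumption made for the illustration: a simple machine is a plain Python function, a compound machine m→n is the pair `(m, n)`, and the evaluator name `run` and the sample functions f, g, h are hypothetical.

```python
def run(machine, a):
    """Evaluate a machine on input a by divide and conquer."""
    if callable(machine):
        return machine(a)              # simple problem: solve it directly
    first, second = machine            # reduction: split m→n into m and n
    b = run(first, a)                  # solve the first component problem
    return run(second, b)              # synthesis: feed that solution onward


f = lambda a: a + 1
g = lambda b: b * 2
h = lambda c: c - 3

left_grouping = ((f, g), h)    # (f→g)→h
right_grouping = (f, (g, h))   # f→(g→h)

print(run(left_grouping, 5), run(right_grouping, 5))
```

The two groupings are broken into component problems in different ways, yet for input 5 both arrive at 9, as the associative principle says they must.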
Murphy’s law – anything that can go wrong will go wrong – has a corollary that might be called the “Tower of Babel principle.” This principle guarantees that if there are multiple inconsistent ways of denoting something then notations will be chosen so as to maximize confusion. This principle applies to the notation for the result of connecting two functions, which I have denoted using an arrow, as in “f→g”. This notation indicates that the processing of an input is carried out “left to right”. Thus, in applying f→g to an input a, we first apply f to the input to obtain f(a) and then apply g to the input to obtain g(f(a)). Thus (f→g)(a) = g(f(a)). Does something about this notation make you uncomfortable? If so, I suspect it’s the fact that the order in which f and g appear has been reversed. If we had used “(a)f “ to denote the output of f for input a then we would have had (a)(f→g) = (a(f))g and the left to right order would be preserved. But, of course, that’s not the way it’s done. In order to avoid this inconsistency, mathematicians use a different notation for the result of connecting two functions that uses a right to left processing order. This operator is called composition and is denoted by “◦” as in “g◦f” and (g◦f)(a) is defined to be g(f(a)). So in compositional notation the right function is applied first and then the left function is applied to the result. Thus g◦f = f→g. So “→” means process left to right and circle means process right to left. In order to emphasize the model of functions as machines, I will primarily use the arrow notation. The reader should be aware, however, that in polite mathematical society, compositional notation, i.e. the circle, is the convention.
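The two notations differ only in the order in which the functions are written down, as a short Python sketch makes plain. The helper names `arrow` and `compose` and the sample functions are invented for this illustration.

```python
def arrow(f, g):
    """f→g: process left to right, first f and then g."""
    return lambda a: g(f(a))


def compose(g, f):
    """g◦f: process right to left, first f and then g."""
    return lambda a: g(f(a))


f = lambda a: a + 1   # hypothetical sample function
g = lambda b: b * 2   # hypothetical sample function

print(arrow(f, g)(3), compose(g, f)(3))   # the same machine written two ways
print(arrow(g, f)(3))                     # a different machine: g first, then f
```

For input 3 both `arrow(f, g)` and `compose(g, f)` give 8, while `arrow(g, f)` gives 7. So the two notations name the same machine, and swapping the arguments within either notation generally names a different one.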
The conventional term for what I have called the “Un-Machine” for a function is “Inverse Function.” For example, if f is the function that multiplies by 2 then the inverse of f is the function that divides by 2. Certain inverse functions are given special names. For example, the inverse of the square function is called the square root function, and the inverse of the trigonometric function sin is called arcsine. For reasons that will be discussed in a later article, the inverse function is sometimes denoted using a superscript “-1” as in f⁻¹.
While we have discussed how to construct the un-machine for a compound machine from the un-machines for its components (by connecting the un-machines in the reverse order), we haven’t said anything about how to find the un-machine for a basic machine, nor have we addressed the question of when such a machine exists at all. The only requirement on an un-machine U for a machine M is that when U is presented with the output of M it somehow produces the input that was fed to M. Now it is not the business of mathematics to take a position on whether phenomena in the category of magic or extrasensory perception exist. Given that, it seems impossible to impose any general requirements on when an un-machine can exist. If U reads M’s mind or uses incantations to accomplish that, so be it. If we restrict ourselves to functions by requiring that both the machine and its un-machine be functions, however, then we can impose a necessary condition on the existence of an un-machine.
Suppose f and g are functions and g is an inverse (remember, that’s the proper term for the un-machine of a function) of f. Then, for any a from the domain of f, if f(a) = b, then g(b) = a. Now suppose a1 and a2 are in the domain of f and f(a1) = f(a2) = b. Then g(b) must equal a1, because f(a1) = b, and g(b) must equal a2, because f(a2) = b. So, since a1 and a2 are both equal to the same value, g(b), we must have a1 = a2. We conclude that for any two elements a1 and a2 from the domain of f, if f(a1) = f(a2) then a1 = a2. Stated differently, if f has an inverse function then f cannot produce the same output for two different inputs.
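For a finite sample of inputs, this condition can be checked mechanically. The following Python sketch is only an illustration; the name is_one_to_one is invented here, and a check on a finite sample can expose a collision but cannot prove that a function on all natural numbers has none.

```python
def is_one_to_one(f, domain):
    """Return False as soon as two different inputs give the same output."""
    source_of = {}  # maps each output seen so far to the input that produced it
    for a in domain:
        b = f(a)
        if b in source_of and source_of[b] != a:
            return False  # f(a1) = f(a2) with a1 != a2
        source_of[b] = a
    return True


print(is_one_to_one(lambda a: a + 2, range(10)))     # no two inputs collide
print(is_one_to_one(lambda a: a * a, range(-3, 4)))  # -3 and 3 collide
```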
One way to picture a function is as a collection of arrows mapping the domain to the range. In order for a function to have an inverse, we cannot have two different arrows arriving at the same point. The picture below illustrates the problem. The green arrows represent a function with two arrows starting at different points, a1 and a2, and arriving at the same point, b. In order for the inverse (indicated by a red arrow in the opposite direction) to exist we must be able to map every arrival point back to its unique source point. But b has at least two source points and so our attempt to create an inverse function is stuck trying to compute the inverse of b.

The picture below illustrates the arrow diagram for the “Add2” function, i.e. the function that adds 2 to its input, whose domain is all natural numbers and whose range is the collection of natural numbers greater than or equal to 2. Since no two arrows land at the same target value, this function can have an inverse, and in fact the “Subtract2” function, whose domain is the collection of natural numbers greater than or equal to 2 and whose range is all natural numbers, is the inverse function for “Add2”.

Note that if we have the arrow diagram for a function which has an inverse then we obtain the arrow diagram for the inverse function by just turning the arrows around.
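If we record a finite piece of an arrow diagram as a Python dictionary (each key is a source point and its value is the target), then “turning the arrows around” amounts to swapping keys and values. The sketch below assumes this dictionary representation, and the function name invert is made up for the illustration.

```python
def invert(arrows):
    """Turn every arrow around; refuse if two arrows land on the same target."""
    inverse = {}
    for source, target in arrows.items():
        if target in inverse:
            raise ValueError(
                f"{inverse[target]} and {source} both map to {target}"
            )
        inverse[target] = source
    return inverse


# The first few arrows of Add2: 0→2, 1→3, 2→4, ...
add2 = {n: n + 2 for n in range(6)}
subtract2 = invert(add2)  # arrows now run 2→0, 3→1, 4→2, ...
print(subtract2)
```

Calling invert on a diagram in which two arrows share a target, such as {0: 0, 1: 0}, raises an error, which mirrors the situation where no inverse function exists.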

Below is the arrow diagram for the “Times 2” function whose domain is the set of natural numbers and which doubles its input.

The range of the “Times 2” function is the collection of even natural numbers. As in the previous example we see that there is exactly one source for each target value and so Times 2 can have an inverse. In this case the inverse is the Divide by 2 function whose domain is the even numbers and whose range is all natural numbers.
As these two examples illustrate, if a function f has an inverse then the domain of the inverse is the range of f and the range of the inverse is the domain of f.
Functions, like the two examples above and any function having an inverse, which are such that no two distinct inputs have the same output are called 1-to-1 (read “one to one”) functions. The relationship between the domain and the range defined by a 1-to-1 function is called a 1-to-1 correspondence because the function relates each element of the domain to exactly one element (no more, no less) of the range. When we count a collection of objects we create a 1-to-1 correspondence between some initial interval of positive whole numbers, 1, 2, …, n (where “n” is the number of objects in the collection) and the objects in the collection. Thus, when we count the fingers on our hands we define a 1-to-1 correspondence between the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and the fingers on our hands.
Now consider the arrow diagram for the “Times 0” function, which multiplies natural number inputs by 0.
This function is the exact opposite of a 1-1 function. Since 0 times anything is 0, all inputs generate the same output. So this function has no inverse. (This is why you can’t “divide by 0.”) The problem of finding an inverse, that is, determining what input yielded some result, is like trying to solve a mystery in a detective novel. The input is like the “culprit” you are trying to catch and the output produced by that input is a “clue” to the mystery. If a function is 1-1 then the single clue provided by that function is enough to uniquely determine the culprit – though whether or not we can trace the clue back to the source may depend on how clever we are. Even if a function is not 1-1, knowing the output may at least reduce the set of “suspects”. In the case of the Times 0 function, however, the clue gives us no information whatsoever and does nothing to reduce the set of suspects.
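The detective picture can be made concrete by listing, for a given clue, every input that could have produced it. The Python sketch below is an illustration on the inputs 0 through 9 only; the helper name suspects is invented for this purpose.

```python
def suspects(f, domain, clue):
    """Return every input in the domain whose output under f equals the clue."""
    return [a for a in domain if f(a) == clue]


def times2(a):
    return 2 * a


def times0(a):
    return 0 * a


inputs = range(10)
print(suspects(times2, inputs, 6))  # a 1-1 function: one suspect remains
print(suspects(times0, inputs, 0))  # Times 0: every input is still a suspect
```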
In a good mystery, the detective will generally require more than one clue. With that in mind consider the following two examples. The “Remainder 2” (Rem2) function accepts natural numbers as inputs and outputs the remainder after dividing the input by 2. If the input is divisible by 2 then the remainder is 0 and otherwise the remainder is one. The arrow diagram for this function is shown below.

The Rem2 function doesn’t uniquely determine the culprit but it at least tells us whether he’s even or odd. Rem2 has a partner function, “Quotient 2” (Quo2) which accepts natural number inputs and outputs the whole number of times 2 goes into the input. This function outputs 0 for inputs 0 and 1, 1 for inputs 2 and 3, 2 for inputs 4 and 5, and so on. So its arrow diagram looks like the following.

In this case, each possible output corresponds to only two possible inputs: one even and one odd. So the Quotient 2 function preserves a lot of information about the input but not enough to uniquely determine it. Notice that while neither Remainder 2 nor Quotient 2 by itself preserves enough information to determine the input, if we have the output of both Quotient 2 and Remainder 2 we can determine the input. For example, suppose we know that Quotient 2 outputs 3 for a given input and Remainder 2 outputs 1 for the same input. From the Quotient 2 information we know that the input must be either 6 or 7 and from the remainder we know that the input must be odd, and so we know the input must be 7. In this case we need two clues to solve the mystery. One way to picture this is to consider a machine that outputs two values: one the value of Quo2 and the other the value of Rem2. Call this the QR2 machine. The inverse QR2 machine accepts two inputs called q and r and outputs (2 x q) + r.
While neither Quo2 nor Rem2 by itself is 1-1, QR2, which computes the pair of values, is 1-1 and, as we see, has an inverse. The table below shows the Quotient 2 and Remainder 2 values for inputs 0 through 9. As we see, no two rows have the same values for both Quotient 2 and Remainder 2.
| INPUT | Quotient 2 | Remainder 2 |
|-------|------------|-------------|
| 0 | 0 | 0 |
| 1 | 0 | 1 |
| 2 | 1 | 0 |
| 3 | 1 | 1 |
| 4 | 2 | 0 |
| 5 | 2 | 1 |
| 6 | 3 | 0 |
| 7 | 3 | 1 |
| 8 | 4 | 0 |
| 9 | 4 | 1 |
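The QR2 machine and its inverse are easy to sketch in Python, where the operators // and % give the whole-number quotient and the remainder. The function names qr2 and qr2_inverse are chosen here for illustration.

```python
def qr2(n):
    """The QR2 machine: output the pair (Quo2, Rem2) for a natural number n."""
    return n // 2, n % 2


def qr2_inverse(q, r):
    """The inverse QR2 machine: recover the input as (2 x q) + r."""
    return 2 * q + r


# Feeding each output pair into the inverse machine recovers the input.
for n in range(10):
    q, r = qr2(n)
    print(n, q, r, qr2_inverse(q, r))
```

Each printed row shows the input, its two clues, and the input again as reconstructed from the clues.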
Our final two examples will be used to illustrate two ways in which a function which is not 1-to-1, and therefore does not have an inverse, can be “made” 1-to-1 by modifying the domain of the function. The first of these two functions is the square function, which squares its input, and, for this example, we will allow all whole numbers, including negative numbers, as input. For reasons discussed in a later article, the product of two negative numbers is positive, and so the square of a negative number is positive. The arrow diagram for this function is as follows. (We’ve used different scales for the domain and range in order to show more values.)

In this case, every value in the range, except 0, corresponds to two elements of the domain and so this function, as is, is not invertible. One way to “make” a function which is not 1-to-1 into a 1-to-1 function is to restrict the function’s domain to a “subdomain” on which it is 1-to-1. In the case of the square function, there are two natural choices. One is to restrict square to the non-negative whole numbers, i.e. 0, 1, 2, … Call this version of square “square+”. The other is to restrict square to the non-positive whole numbers, i.e. 0, -1, -2, … Call this version of square “square–”. The square root function “√” is the inverse function for square+ and the negative square root function “-√” is the inverse for square–.
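As a rough Python sketch of the two restrictions, we can use the standard library function math.isqrt (available in Python 3.8 and later) as the whole-number square root. The names sqrt_plus and sqrt_minus are invented here, and both are only meaningful as inverses when their input is a perfect square, i.e. a value in the range of square.

```python
import math


def square(n):
    return n * n


def sqrt_plus(b):
    """Inverse of square+ (square restricted to 0, 1, 2, ...)."""
    return math.isqrt(b)


def sqrt_minus(b):
    """Inverse of square- (square restricted to 0, -1, -2, ...)."""
    return -math.isqrt(b)


print(sqrt_plus(square(4)), sqrt_minus(square(-4)))
# On the unrestricted domain the round trip fails: -4 comes back as 4.
print(sqrt_plus(square(-4)))
```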




For this next example imagine that you own a “parts store” which sells geometric shapes. Your inventory consists of one purple triangle with model number “T25IP”, two red triangles with model number “T25IR”, three green circles with model number “C25G”, and two blue squares with model number “S25B”. In order to track your inventory, each specific part has a unique serial number which is obtained by appending a sequence number to the model number. For example, the two red triangles have serial numbers T25IR1 and T25IR2. We will be interested in two functions defined on the parts. One maps each part to its serial number and the other maps each part to its model number. The arrow diagrams are as follows.

As we see, the serial number function is 1-1. This is as expected. After all, the whole point of a serial number is to uniquely identify each part in our inventory. If there were two parts with the same serial number we would lose track of the parts with the duplicate numbers. The inverse of the serial number function maps each serial number to the unique part that carries it. This is what we would do when we “check our inventory”, i.e. verify that there is a part matching every serial number in our records, so I will call it the “inventory function.”
The model number function is not 1-1. Parts having the same shape, color, and size have the same model number, and of course there may be many parts having a given model number. On the other hand, we sometimes treat the model number as if it uniquely identifies some object. A customer might say, “I need the C25G for my drawing.” Or he might ask, “What is the color of the S25B?” or “What is the model number of the 0.25 inch green circle?” When we do this we are in effect treating all parts having a given size, shape, and color as if they were the same. We are changing the domain by changing the meaning of “sameness” on the domain and, as we discussed earlier, this is fine so long as we confine this usage to statements about which all “same” objects agree. We can picture this new version of the model number function, where we declare all objects with the same shape, color, and size to be the same, as in the following diagram.
What we have done is to replace all the real objects having a given model number with an imaginary object. The imaginary object does not have all the properties of a real object. It doesn’t have a serial number or a physical location. The only properties it has are size, shape, color, and model number. As long as we stick to topics related to these specific properties, we can talk about this imaginary object as if it were real. For example, we can say, “The color of the S25B is blue.” Or, “The S25B is too big for your painting.” Of course what these statements really mean is that any object which is the same as S25B with respect to these properties will satisfy these statements. The technique of creating an imaginary object to represent the common properties of some collection of more or less real objects is called abstraction. This technique allows us to isolate the essential features of systems and reason in such a way that our conclusions apply to an entire class of objects.
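The parts store can be sketched in Python to show how grouping by model number produces the “imaginary objects.” Only the serial numbers T25IR1 and T25IR2 are given in the text; the other serial numbers below are assumed to follow the same rule of appending a sequence number, and the function names are invented for this illustration.

```python
from collections import defaultdict

# Each real part: (serial number, model number, shape, color).
parts = [
    ("T25IP1", "T25IP", "triangle", "purple"),
    ("T25IR1", "T25IR", "triangle", "red"),
    ("T25IR2", "T25IR", "triangle", "red"),
    ("C25G1", "C25G", "circle", "green"),
    ("C25G2", "C25G", "circle", "green"),
    ("C25G3", "C25G", "circle", "green"),
    ("S25B1", "S25B", "square", "blue"),
    ("S25B2", "S25B", "square", "blue"),
]


def abstract_by_model(parts):
    """Group the real parts that count as 'the same' under the model number."""
    classes = defaultdict(list)
    for serial, model, _shape, _color in parts:
        classes[model].append(serial)
    return dict(classes)


def color_of(model, parts):
    """Answer 'what is the color of the S25B?' only if all such parts agree."""
    colors = {color for _serial, m, _shape, color in parts if m == model}
    if len(colors) != 1:
        raise ValueError(f"parts with model {model} do not agree on a color")
    return colors.pop()


print(abstract_by_model(parts))
print(color_of("S25B", parts))
```

The check inside color_of reflects the condition stated above: a statement about the imaginary object is legitimate only when every real object it stands for agrees on the answer.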
In this article we have attempted to lay the foundation on which we will build our models for understanding the meaning of numbers and arithmetic.
[1] In life we handle this distinction by imagining that any kind of sameness test corresponds to some component of an object. In a tradition going back at least to the ancient Greek philosophers, we imagine that objects are collections of “essences” and that essences are unique. Thus, when I say, “the color of my coat is red”, I imagine my coat as a collection of essences, including the color essence and that the color essence is identical to red. Thus I would be entitled to write color(coat) = red.