A TREE is a data structure in which the data items are NOT
arranged in a simple list. This means it is not necessarily natural
to store these items in an array - this makes it significantly
different from a LIST, a STACK, or a QUEUE.
For example, the tree below is used to store all of the 3 letter
words which can be made using the letters A and B:
Nothing
/ \
/ \
A B
/ \ / \
AA AB BA BB
/ \ / \ / \ / \
AAA AAB
Notice that each two-letter word "gives birth" to 2 different 3-letter
words, by adding either an A or a
B. Thus, we refer
to
as the "children" of AB, and we call AB the "parent".
A tree CAN contain more than two children per parent - for example,
draw a tree containing all 3-letter words made from A,B, and C. In this
case, each small word has 3 children, formed by adding A,B, or C.
The tree above, in which each parent has 2 children, is called a
BINARY tree. The children are commonly called the LEFT-CHILD and the
RIGHT-CHILD. The orientation (left or right) is sometimes unimportant,
other times very important.
More vocabulary: The "words" in the tree above are called NODES.
The lines are called BRANCHES.
A NODE with no children is sometimes called an ATOM.
A part of the tree, for example all of the words
starting with A, is called a SUB-TREE.
A tree can be used to "think ahead" in a game, such as TIC-TAC-TOE.
Below is a tree starting in the middle of a game of tic-tac-toe.
. X .
. O O
/~~~~~~~~~~~~ X O X ~~~~~~~~~~~~\
/ | \
/ | \
. X . X X . . X X
X O O . O O . O O
X O X X O X X O X
/ \ / \ / \
O X . . X O X X . X X O O X X . X X
X O O X O O O O O . O O . O O O O O
X O X X O X X O X X O X X O X X O X
| | ------- | | -------
O X X X X O O-WINS! X X O O X X O-WINS!
X O O X O O X O O X O O
X O X X O X X O X X O X
------- ------- ------- -------
DRAW X-WINS! X-WINS! DRAW
Notice that this is NOT a binary tree. Nor is it a tree with 3 or 4
branches per node. In this case, there are fewer and fewer branches
at each level (since there are fewer and fewer free squares as the game
progresses), and some branches stop, because the game is over.
Being 2-dimensional, trees are NOT a natural structure for
a digital computer, whose memory locations are arranged in a
simple list (array.) In order to store a tree in the computer,
we need to make special arrangements.
One way would be to DRAW the tree on the screen, using a mouse,
and then store the picture by storing the pixels. This is a very
bad method, since a simple question like "How many nodes are there
in this tree?" is extremely difficult to answer.
A better method is to store the tree inside an array. To do this,
we need to assign each node to a position in the array. Then, to
represent arrows, we use "pointers" containing the "address" of a node -
this is its position in the array.
The three-letter-AB words tree from the previous page is
represented in arrays here:
Position Word Parent LeftChild RightChild
-------- --------- --------- --------- ----------
1 nothing 0 1 2
2 A 1 4 5
3 B 1 6 7
4 AA 2 8 9
5 AB 2 10 11
6 BA 3 12 13
7 BB 3 14 15
8 AAA 4 0 0
9 AAB 4 0 0
10
11 ABB 5 0 0
12 BAA 6 0 0
13 BAB 6 0 0
14 BBA 7 0 0
15 BBB 7 0 0
Look at the 3rd row, under LeftChild. This number 6 means that
the left-branch in the tree from B goes to BA. A ZERO indicates
that there is no branch. For example, ABB has no children, so both
child pointers are 0. Similarly, the NOTHING entry (also called the
ROOT) has no parent, so the parent pointer is zero.
Problem 1: Finish representing the tic-tac-toe tree in
an array. You will need a large piece of paper, since
each node contains a board, rather than a word.
Position Board Parent LeftChild MiddleChild RightChild
. X .
1 . O O 0 2 3 4
X O X
. X .
2 X O O 1 5 0 6
X O X
X X .
3 . O O 1 7 0 8
X O X
----------------------------------------------------------------------
A common example of a binary tree is the process used by young
mathematics students to find the prime-factorization of a number.
The algorithm can be described as follows:
(1) Get a blank sheet of paper (Initialize)
(2) Write the number in the middle at the top of the page. (Root)
(3) Think of two numbers which, when you
(
multiply them, would equal "the number."
(a) If you found two numbers, then draw left
and right branches and put these numbers
at the ends.
(b) If you did not find two numbers, then
there is nothing more to do with this number.
(4) Do 3 for each number which you wrote in 3. (Repeat)
If none of the new numbers can be split, then stop.
(5) Write down all of the ATOMS - those numbers which
were not split - this is the prime-factorization.
For example, prime factoring 120 can give a variety of trees:
120 120
/ \ / \
12 10 2 60
/ \ / \ / \
3 4 2 5 2 30
/ \ / \
2 2 2 15
/ \
The prime factorization is: 3 5
120 = 3*2*2*2*5
Oddly enough, although a variety of trees are possible, they all contain
the same atoms - that is, the prime-factorization is unique.
The general algorithm for building up a binary tree is -
(1) Initialize - empty the tree
(2) Root - get a value for the root
(3) Split an atom
(4) Repeat (3) until finished
This concept is quite natural, but it is a bit tricky to keep track
of things in arrays. Thus, it is important to write good procedures
for (1) (2) and (3) - splitting an atom is especially tricky.
Problem (2) : Write procedures for (1),(2), and (3). Then use them
to input a BINARY TREE, for example a
prime-factorization tree. The
procedure should work as follows: the user types in the "name" of an atom,
and the names of the two children, and the appropriate branches are added
to the arrays. Then write a 4th procedure - print atoms. This is a very
simple procedure - it simply searches the list of nodes, looking for zero
pointers in the left and right children.
Problem (3) : Automate this process, so that you write a program
which prime factors a number by using the tree
method. After each
the computer shows the list of atoms. The user types the name of an atom,
and the computer splits that atom if it is possible. Thus, the user
decides which atom to split, but the computer takes care of finding the
factors. This means you will need a new procedure which accepts an
integer as input, and returns two factors. It could return 0,0 for a
prime input - that is, a number with no factors. In this case,
no split occurs. Do not allow the following:
15
/ \
3 5
/ \
1 3
/ \
1 3
.............. Infinite tree, getting nowhere.
Problem (4) : Automate the process further: the computer chooses the
atom which is to be split. It loops, by itself, until none of the
atoms can be split. Then it stops! This is a bit tricky. Now you
should be able to type in a number, and the computer does the prime
factorization by itself.
Problem (5) : Although it is of no particular theoretical importance,
it is nice for people to be able to SEE a tree. The drawings above are
quite difficult to produce using a computer program, and have the
possibility of growing too large for the screen. Look at the Explorer
program in Windows . This is one possible way to represent a tree.
Another method, which makes sense in the prime-factorization problem,
is the following:
120
(12)(10)
((3)(4))((2)(5))
((3)((2)(2)))((2)(5))
There is one "formula" for each "level" in the tree.
Try writing a SHOWTREE procedure, which prints this list of formulas.
Problem (6) : The prime-factorization problem can produce many different
trees for the same number. Can you write a program which would COUNT the
number of different trees for a given number?