2 #Time-stamp: "2001-02-23 20:09:47 MST" -*-Text-*-
3 # This document contains text in Perl "POD" format.
4 # Use a POD viewer like perldoc or perlman to render it.
8 HTML::Tree::AboutTrees -- article on tree-shaped data structures in Perl
12 # This an article, not a module.
16 The following article by Sean M. Burke first appeared in I<The Perl
17 Journal> #18 and is copyright 2000 The Perl Journal. It appears
18 courtesy of Jon Orwant and The Perl Journal. This document may be
19 distributed under the same terms as Perl itself.
27 "AaaAAAaauugh! Watch out for that tree!"
28 -- I<George of the Jungle theme>
32 Perl's facility with references, combined with its automatic management of
33 memory allocation, makes it straightforward to write programs that store data
34 in structures of arbitrary form and complexity.
36 But I've noticed that many programmers, especially those who started out
37 with more restrictive languages, seem at home with complex but uniform
38 data structures -- N-dimensional arrays, or more struct-like things like
39 hashes-of-arrays(-of-hashes(-of-hashes), etc.) -- but they're often uneasy
40 with buliding more freeform, less tabular structures, like
41 tree-shaped data structures.
43 But trees are easy to build and manage in Perl, as I'll demonstrate
44 by showing off how the HTML::Element class manages elements in an HTML
45 document tree, and by walking you through a from-scratch implementation
46 of game trees. But first we need to nail down what we mean by a "tree".
48 =head2 Socratic Dialogues: "What is a Tree?"
50 My first brush with tree-shaped structures was in linguistics classes,
51 where tree diagrams are used to describe the syntax underlying natural
52 language sentences. After learning my way around I<those> trees, I
53 started to wonder -- are what I'm used to calling "trees" the same as what
54 programmers call "trees"? So I asked lots of helpful and patient
55 programmers how they would define a tree. Many replied with a
56 answer in jargon that they could not really explain (understandable,
57 since explaining things, especially defining things, is harder
62 -- So what I<is> a "tree", a tree-shaped data structure?
64 -- A tree is a special case of an acyclic directed graph!
68 -- Um... lines... and... you draw it... with... arcs! nodes! um...
72 The most helpful were folks who couldn't explain directly, but with
73 whom I could get into a rather Socratic dialog (where I<I> asked the
74 half-dim half-earnest questions), often with much doodling of
77 Question: so what's a tree?
79 Answer: A tree is a collection of nodes that are linked together in a,
80 well, tree-like way! Like this I<[drawing on a napkin]:>
88 Q: So what do these letters represent?
90 A: Each is a different node, a bunch of data. Maybe C is a
91 bunch of data that stores a number, maybe a hash table, maybe nothing
92 at all besides the fact that it links to D, E, and F (which are other
95 Q: So what're the lines between the nodes?
97 A: Links. Also called "arcs". They just symbolize the fact that each
98 node holds a list of nodes it links to.
100 Q: So what if I draw nodes and links, like this...
108 Is that still a tree?
110 A: No, not at all. There's a lot of un-treelike things about that.
111 First off, E has a link coming off of it going into nowhere. You can't have
112 a link to nothing -- you can only link to another node. Second off, I
113 don't know what that sideways link between B and E means...
115 Q: Okay, let's work our way up from something simpler. Is this a tree...?
119 A: Yes, I suppose. It's a tree of just one node.
127 A: No, you can't just have nodes floating there, unattached.
129 Q: Okay, I'll link A and B. How's this?
135 A: Yup, that's a tree. There's a node A, and a node B, and they're linked.
137 Q: How is that tree any different from this one...?
143 A: Well, in both cases A and B are linked. But it's in a different
146 Q: Direction? What does the direction mean?
148 A: Well, it depends what the tree represents. If it represents a
149 categorization, like this:
153 orange lemon kumquat ...
155 then you mean to say that oranges, lemons, kumquats, etc., are a kind of
156 citrus. But if you drew it upside down, you'd be saying, falsely, that
157 citrus is a kind of kumquat, a kind of lemon, and a kind of orange.
158 If the tree represented cause-and-effect (or at least what situations
159 could follow others), or represented what's a part of what, you
160 wouldn't want to get those backwards, either. So with the nodes you
161 draw together on paper, one has to be over the other, so you can tell which
162 way the relationship in the tree works.
164 Q: So are these two trees the same?
171 A: Yes, although by convention we often try to line up things in the
172 same generation, like it is in the diagram on the left.
174 Q: "generation"? This is a family tree?
176 A: No, not unless it's a family tree for just yeast cells or something
177 else that reproduces asexually.
178 But for sake of having lots of terms to use, we just pretend that links
179 in the tree represent the "is a child of" relationship, instead of "is a
180 kind of" or "is a part of", or "could result from", or whatever the real
181 relationship is. So we get to borrow a lot of kinship words for
182 describing trees -- B and C are "children" (or "daughters") of A; A is
183 the "parent" (or "mother") of B and C. Node C is a "sibling" (or
184 "sister") of node C; and so on, with terms like "descedants" (a node's
185 children, children's children, etc.), and "generation" (all the
186 nodes at the same "level" in the tree, i.e., are either all
187 grandchildren of the top node, or all great-grand-children, etc.), and
188 "lineage" or "ancestors" (parents, and parent's parents, etc., all the
189 way to the topmost node).
191 So then we get to express rules in terms like "B<A node cannot have more
192 than one parent>", which means that this is not a valid tree:
200 And: "B<A node can't be its own parent>", which excludes this looped-up
207 Or, put more generally: "B<A node can't be its own ancestor>", which
208 excludes the above loop, as well as the one here:
218 That tree is excluded because A is a child of Z, and Z is a child of C,
219 and C is a child of A, which means A is its own great-grandparent. So
220 this whole network can't be a tree, because it breaks the sort of
221 meta-rule: B<once any node in the supposed tree breaks the rules for
222 trees, you don't have a tree anymore.>
224 Q: Okay, now, are these two trees the same?
230 A: It depends whether you're basing your concept of trees on each node
231 having a set (unordered list) of children, or an (ordered) list of
232 children. It's a question of whether ordering is important for what
233 you're doing. With my diagram of citrus types, ordering isn't
234 important, so these tree diagrams express the same thing:
244 because it doesn't make sense to say that oranges are "before" or
245 "after" kumquats in the whole botanical scheme of things. (Unless, of
246 course, you I<are> using ordering to mean something, like a degree of
249 But consider a tree that's a diagram of what steps are comprised in an
250 activity, to some degree of specificity:
261 This means that making tea consists of putting hot water in a cup or
262 put, infusing it (which itself consists of adding tea leaves and letting
263 it sit), then serving it -- I<in that order>. If you serve an empty
264 dry pot (sipping from empty cups, etc.), let it sit, add tea leaves,
265 and pour in hot water, then what you're doing is performance art, not
278 Except for my having renamed the root, this tree is the same as
279 the making-tea tree as far as what's under what, but it differs
280 in order, and what the tree means makes the order important.
282 Q: Wait -- "root"? What's a root?
284 A: Besides kinship terms like "mother" and "daugher", the jargon for
285 tree parts also has terms from real-life tree parts: the part that
286 everything else grows from is called the root; and nodes that don't
287 have nodes attached to them (i.e., childless nodes) are called
290 Q: But you've been drawing all your trees with the root at the top and
291 leaves at the bottom.
293 A: Yes, but for some reason, that's the way everyone seems to think of
294 trees. They can draw trees as above; or they can draw them sort of
295 sideways with indenting representing what nodes are children of what:
298 * pour hot water in cup/pot
304 ...but folks almost never seem to draw trees with the root at the
305 bottom. So imagine it's based on spider plant in a hanging pot.
306 Unfortunately, spider plants I<aren't> botanically trees, they're
307 plants; but "spider plant diagram" is rather a mouthful, so let's just
310 =head2 Trees Defined Formally
312 In time, I digested all these assorted facts about programmers' ideas of
313 trees (which turned out to be just a more general case of linguistic
314 ideas of trees) into a single rule:
316 * A node is an item that contains ("is over", "is parent of", etc.)
317 zero or more other nodes.
319 From this you can build up formal definitions for useful terms, like so:
321 * A node's B<descendants> are defined as all its children, and all
322 their children, and so on. Or, stated recursively: a node's
323 descendants are all its children, and all its children's descendants.
324 (And if it has no children, it has no descendants.)
326 * A node's B<ancestors> consist of its parent, and its parent's
327 parent, etc, up to the root. Or, recursively: a node's ancestors
328 consist of its parent and its parent's ancestors. (If it has no parent,
329 it has no ancestors.)
331 * A B<tree> is a root node and all the root's descendants.
333 And you can add a proviso or two to clarify exactly what I impute to the
334 word "other" in "other nodes":
336 * A node cannot contain itself, or contain any node that contains it,
337 etc. Looking at it the other way: a node cannot be its own parent or
340 * A node can be root (i.e., no other node contains it) or can be
341 contained by only one parent; no node can be the child of two or more
344 Add to this the idea that children are sometimes ordered, and sometimes
345 not, and that's about all you need to know about defining what a tree
346 is. From there it's a matter of using them.
348 =head2 Markup Language Trees: HTML-Tree
350 While not I<all> markup languages are inherently tree-like, the
351 best-known family of markup languages, HTML, SGML, and XML, are about
352 as tree-like as you can get. In these languages, a document consists
353 of elements and character data in a tree structure where
354 there is one root element, and elements can contain either other
355 elements, or character data.
360 For sake of simplicity, I'm glossing over
361 comments (<!-- ... -->), processing instructions (<?xml
362 version='1.0'>), and declarations (<!ELEMENT ...>, <!DOCTYPE ...>).
363 And I'm not bothering to distinguish entity references
364 (<, @) or CDATA sections (<![CDATA[ ...]]>) from normal text.
368 For example, consider this HTML document:
376 <body bgcolor="#d010ff">
385 I've indented this to point out what nodes (elements or text items) are
386 children of what, with each node on a line of its own.
388 The HTML::TreeBuilder module (in the CPAN distribution HTML-Tree)
389 does the work of taking HTML source and
390 building in memory the tree that the document source represents.
394 Footnote: it requires the HTML::Parser module, which tokenizes the
395 source -- i.e., identifies each tag, bit of text, comment, etc.
399 The trees structures that it builds represent bits of text with
400 normal Perl scalar string values; but elements are represented with
401 objects -- that is, chunks of data that belong to a
402 class (in this case, HTML::Element), a class that provides methods
403 (routines) for accessing the pieces of data in each element, and
404 otherwise doing things with elements. (See my article in TPJ#17 for a
405 quick explanation of objects, the POD document C<perltoot> for a longer
406 explanation, or Damian Conway's excellent book I<Object-Oriented Perl>
409 Each HTML::Element object contains a number of pieces of data:
411 * its element name ("html", "h1", etc., accessed as $element->tag)
413 * a list of elements (or text segments) that it contains, if any
414 (accessed as $element->content_list or $element->content, depending on
415 whether you want a list, or an arrayref)
417 * what element, if any, contains it (accessed as $element->parent)
419 * and any SGML attributes that the element has,
420 such as C<lang="en-US">, C<align="center">, etc. (accessed as
421 $element->attr('lang'), $element->attr('center'), etc.)
423 So, for example, when HTML::TreeBuilder builds the tree for the above
424 HTML document source, the object for the "body" element has these pieces of
427 * element name: "body"
429 the string "I've got "
430 the object for the "em" element
433 the object for the "html" element
436 Now, once you have this tree of objects, almost anything you'd want to
437 do with it starts with searching the tree for some bit of information
440 Accessing a piece of information in, say, a hash of hashes of hashes,
443 $password{'sean'}{'sburke1'}{'hpux'}
445 because you know that all data points in that structure are accessible
446 with that syntax, but with just different keys. Now, the "em" element
447 in the above HTML tree does happen to be accessible
448 as the root's child #1's child #1:
450 $root->content->[1]->content->[1]
452 But with trees, you typically don't know the exact location (via
453 indexes) of the data you're looking for. Instead, finding what you want
454 will typically involve searching through the tree, seeing if every node is
455 the kind you want. Searching the whole tree is simple enough -- look at
456 a given node, and if it's not what you want, look at its children, and
457 so on. HTML-Tree provides several methods that do this for you, such as
458 C<find_by_tag_name>, which returns the elements (or the first element, if
459 called in scalar context) under a given node (typically the root) whose
460 tag name is whatever you specify.
462 For example, that "em" node can be found as:
464 my $that_em = $root->find_by_tag_name('em');
468 @ems = $root->find_by_tag_name('em');
469 # will only have one element for this particular tree
471 Now, given an HTML document of whatever structure and complexity, if you
472 wanted to do something like change every
476 E<lt>emE<gt>I<stuff>E<lt>/emE<gt>
484 E<lt>em class="funky"E<gt>
485 B<E<lt>bE<gt>[-E<lt>/bE<gt>>
487 B<E<lt>bE<gt>-]E<lt>/bE<gt>>
492 the first step is to frame this operation in terms of what you're doing
493 to the tree. You're changing this:
507 In other words, you're finding all elements whose tag name is "em",
508 setting its class attribute to "funky", and adding one child to the start
509 of its content list -- a new "b" element
510 whose content is the text string "[-" -- and one to the end of its
511 content list -- a new "b" element whose content is the text string "-]".
513 Once you've got it in these terms, it's just a matter of running to the
514 HTML::Element documentation, and coding this up with calls to the
515 appropriate methods, like so:
517 use HTML::Element 1.53;
518 use HTML::TreeBuilder 2.96;
519 # Build the tree by parsing the document
520 my $root = HTML::TreeBuilder->new;
521 $root->parse_file('whatever.html'); # source file
523 # Now make new nodes where needed
524 foreach my $em ($root->find_by_tag_name('em')) {
525 $em->attr('class', 'funky'); # Set that attribute
527 # Make the two new B nodes
528 my $new1 = HTML::Element->new('b');
529 my $new2 = HTML::Element->new('b');
530 # Give them content (they have none at first)
531 $new1->push_content('[-');
532 $new2->push_content('-]');
534 # And put 'em in place!
535 $em->unshift_content($new1);
536 $em->push_content($new2);
539 "<!-- Looky see what I did! -->\n",
540 $root->as_HTML(), "\n";
542 The class HTML::Element provides just about every method I can image you
543 needing, for manipulating trees made of HTML::Element objects. (And
544 what it doesn't directly provide, it will give you the components to build
547 =head2 Building Your Own Trees
549 Theoretically, any tree is pretty much like any other tree, so you could
550 use HTML::Element for anything you'd ever want to do with tree-arranged
551 objects. However, as its name implies, HTML::Element is basically
552 I<for> HTML elements; it has lots of features that make sense only for
553 HTML elements (like the idea that every element must have a tag-name).
554 And it lacks some features that might be useful for general applications
555 -- such as any sort of checking to make sure that you're not trying to
556 arrange objects in a non-treelike way. For a general-purpose tree class
557 that does have such features, you can use Tree::DAG_Node, also available
560 However, if your task is simple enough, you might find it overkill to
561 bother using Tree::DAG_Node. And, in any case, I find that the best
562 way to learn how something works is to implement it (or something like
563 it, but simpler) yourself. So I'll here discuss how you'd implement a tree
564 structure, I<without> using any of the existing classes for tree nodes.
566 =head2 Implementation: Game Trees for Alak
568 Suppose that the task at hand is to write a program that can play
569 against a human opponent at a strategic board game (as opposed to a
570 board game where there's an element of chance). For most such games, a
571 "game tree" is an essential part of the program (as I will argue,
572 below), and this will be our test case for implementing a tree
573 structure from stratch.
575 For sake of simplicity, our game is not chess or backgammon, but instead
576 a much simpler game called Alak. Alak was invented by the mathematician
577 A. K. Dewdney, and described in his 1984 book I<Planiverse>. The rules
582 Footnote: Actually, I'm describing only my
583 interpetation of the rules Dewdney describes in I<Planiverse>. Many
584 other interpretations are possible.
588 * Alak is a two-player game played on a one-dimensional board with
589 eleven slots on it. Each slot can hold at most one piece at a time.
590 There's two kinds of pieces, which I represent here as "x" and "o" --
591 x's belong to one player (called X), o's to the other (called O).
593 * The initial configuration of the board is:
597 For sake of the article, the slots are numbered from 1 (on the left) to
598 11 (on the right), and X always has the first move.
600 * The players take turns moving. At each turn, each player can move
601 only one piece, once. (This unlike checkers, where you move one piece
602 per move but get to keep moving it if you jump an your opponent's
603 piece.) A player cannot pass up on his turn. A player can move any one
604 of his pieces to the next unoccupied slot to its right or left, which
605 may involve jumping over occupied slots. A player cannot move a piece
606 off the side of the board.
608 * If a move creates a pattern where the opponent's pieces are
609 surrounded, on both sides, by two pieces of the mover's color (with no
610 intervening unoccupied blank slot), then those surrounded pieces are
611 removed from the board.
613 * The goal of the game is to remove all of your opponent's pieces, at
614 which point the game ends. Removing all-but-one ends the game as
615 well, since the opponent can't surround you with one piece, and so will
616 always lose within a few moves anyway.
618 Consider, then, this rather short game where X starts:
621 ^ Move 1: X moves from 3 (shown with caret) to 5
622 (Note that any of X's pieces could move, but
623 that the only place they could move to is 5.)
625 ^ Move 2: O moves from 9 to 7.
627 ^ Move 3: X moves from 4 to 6.
629 ^ Move 4: O (stupidly) moves from 10 to 9.
631 ^ Move 5: X moves from 5 to 10, making the board
632 "xx___xoooxo". The three o's that X just
633 surrounded are removed.
635 O has only one piece, so has lost.
637 Now, move 4 could have gone quite the other way:
640 Move 4: O moves from 8 to 4, making the board
641 "xx_oxxo__oo". The surrounded x's are removed.
643 ^ Move 5: X moves from 1 to 2.
645 ^ Move 6: O moves from 7 to 6.
647 ^ Move 7: X moves from 2 to 5, removing the o at 4.
651 To teach a computer program to play Alak (as player X, say), it needs to
652 be able to look at the configuration of the board, figure out what moves
653 it can make, and weigh the benefit or costs, immediate or eventual, of
656 So consider the board from just before move 3, and figure all the possible
657 moves X could make. X has pieces in slots 1, 2, 4, and 5. The leftmost
658 two x's (at 1 and 2) are up against the end of the board, so they
659 can move only right. The other two x's (at 4 and 5) can move either
662 Starting board: xx_xx_oo_oo
663 moving 1 to 3 gives _xxxx_oo_oo
664 moving 2 to 3 gives x_xxx_oo_oo
665 moving 4 to 3 gives xxx_x_oo_oo
666 moving 5 to 3 gives xxxx__oo_oo
667 moving 4 to 6 gives xx__xxoo_oo
668 moving 5 to 6 gives xx_x_xoo_oo
670 For the computer to decide which of these is the best move to make, it
671 needs to quantify the benefit of these moves as a number -- call that
672 the "payoff". The payoff of a move can be figured as just the number
673 of x pieces removed by the most recent move, minus the nubmer of o
674 pieces removed by the most recent move. (It so happens that the rules
675 of the game mean that no move can delete both o's and x's, but the
676 formula still applies.) Since none of these moves removed any pieces,
677 all these moves have the same immediate payoff: 0.
679 Now, we could race ahead and write an Alak-playing program that could
680 use the immediate payoff to decide which is the best move to make.
681 And when there's more than one best move (as here, where all the moves
682 are equally good), it could choose randomly between the good
683 alternatives. This strategy is simple to implement; but it makes for a
684 very dumb program. Consider what O's response to each of the potential
685 moves (above) could be. Nothing immediately suggests itself for the
686 first four possibilities (X having moved something to position 3), but
687 either of the last two (illustrated below) are pretty perilous,
688 because in either case O has the obvious option (which he would be
689 foolish to pass up) of removing x's from the board:
694 ^ O moves 8 to 4, giving "xx_oxxo__oo". The two
695 surrounded x's are removed.
703 ^ O moves 8 to 5, giving "xx_xoxo__oo". The one
704 surrounded x is removed.
707 Both contingencies are quite bad for X -- but this is not captured
708 by the fact that they start out with X thinking his move will be
709 harmless, having a payoff of zero.
711 So what's needed is for X to think I<more> than one step ahead -- to
712 consider not merely what it can do in this move, and what the payoff
713 is, but to consider what O might do in response, and the
714 payoff of those potential moves, and so on with X's possible responses
715 to those cases could be. All these possibilities form a game tree -- a
716 tree where each node is a board, and its children are successors of
717 that node -- i.e., the boards that could result from every move
718 possible, given the parent's board.
720 But how to represent the tree, and how to represent the nodes?
722 Well, consider that a node holds several pieces of data:
724 1) the configuration of the board, which, being nice and simple and
725 one-dimensional, can be stored as just a string, like "xx_xx_oo_oo".
727 2) whose turn it is, X or O. (Or: who moved last, from which we can
728 figure whose turn it is).
730 3) the successors (child nodes).
732 4) the immediate payoff of having moved to this board position from its
733 predecessor (parent node).
735 5) and what move gets us from our predecessor node to here. (Granted,
736 knowing the board configuration before and after the move, it's easy to
737 figure out the move; but it's easier still to store it as one is
738 figuring out a node's successors.)
740 6) whatever else we might want to add later.
742 These could be stored equally well in an array or in a hash, but it's my
743 experience that hashes are best for cases where you have more than just
744 two or three bits of data, or especially when you might need to add new
745 bits of data. Moreover, hash key names are mnemonic --
746 $node->{'last_move_payoff'} is plain as day, whereas it's not so easy having to
747 remember with an array that $node->[3] is where you decided to keep the
753 Of course, there are ways around that problem: just swear you'll never
754 use a real numeric index to access data in the array, and instead use
755 constants with mnemonic names:
758 use constant idx_PAYOFF => 3;
762 Or use a pseudohash. But I prefer to keep it simple, and use a hash.
764 These are, incidentally, the same arguments that
765 people weigh when trying to decide whether their object-oriented
766 modules should be based on blessed hashes, blessed arrays, or what.
767 Essentially the only difference here is that we're not blessing our
768 nodes or talking in terms of classes and methods.
774 So, we might as well represent nodes like so:
777 'board' => ...board string, e.g., "xx_x_xoo_oo"
779 'last_move_payoff' => ...payoff of the move
782 'last_move_from' => ...the start...
783 'last_move_to' => ...and end point of the move
784 that got us here. E.g., 5 and 6,
785 representing a move from 5 to 6.
787 'whose_turn' => ...whose move it then becomes.
790 'successors' => ...the successors
793 Note that we could have a field called something like 'last_move_who' to
794 denote who last moved, but since turns in Alak always alternate (and
795 no-one can pass), storing whose move it is now I<and> who last moved is
796 redundant -- if X last moved, it's O turn now, and vice versa.
797 I chose to have a 'whose_turn' field instead of a 'last_move_who', but
798 it doesn't really matter. Either way, we'll end up inferring one from
799 the other at several points in the program.
801 When we want to store the successors of a node, should we use an array
802 or a hash? On the one hand, the successors to $node aren't essentially
803 ordered, so there's no reason to use an array per se; on the other hand,
804 if we used a hash, with successor nodes as values, we don't have
805 anything particularly meaningful to use as keys. (And we can't use the
806 successors themselves as keys, since the nodes are referred to by
807 hash references, and you can't use a reference as a hash key.) Given no
808 particularly compelling reason to do otherwise, I choose to just use an
809 array to store all a node's successors, although the order is never
810 actually used for anything:
814 'successors' => [ ...nodes... ],
818 In any case, now that we've settled on what should be in a node,
819 let's make a little sample tree out of a few nodes and see what we can
822 # Board just before move 3 in above game
824 'board' => 'xx_xx_oo_oo',
825 'last_move_payoff' => 0,
826 'last_move_from' => 9,
832 # And, for now, just two of the successors:
834 # X moves 4 to 6, giving xx__xxoo_oo
836 'board' => 'xx__xxoo_oo',
837 'last_move_payoff' => 0,
838 'last_move_from' => 4,
844 # or X moves 5 to 6, giving xx_x_xoo_oo
846 'board' => 'xx_x_xoo_oo',
847 'last_move_payoff' => 0,
848 'last_move_from' => 5,
854 # Now connect them...
855 push @{$n0->{'successors'}}, $n1, $n2;
857 =head2 Digression: Links to Parents
859 In comparing what we store in an Alak game tree node to what
860 HTML::Element stores in HTML element nodes, you'll note one big
861 difference: every HTML::Element node contains a link to its parent,
862 whereas we don't have our Alak nodes keeping a link to theirs.
864 The reason this can be an important difference is because it can affect
865 how Perl knows when you're not using pieces of memory anymore.
866 Consider the tree we just built, above:
872 There's two ways Perl knows you're using a piece of memory:
873 1) it's memory that belongs directly to a variable (i.e., is necessary
874 to hold that variable's value, or valueI<s> in the case of a hash or
875 array), or 2) it's a piece of memory that something holds a reference
876 to. In the above code, Perl knows that the hash for node 0 (for board
877 "xx_xx_oo_oo") is in use because something (namely, the variable
878 C<$n0>) holds a reference to it. Now, even if you followed the above
881 $n1 = $n2 = 'whatever';
883 to make your variables C<$n1> and C<$n2> stop holding references to
884 the hashes for the two successors of node 0, Perl would still know that
885 those hashes are still in use, because node 0's successors array holds
886 a reference to those hashes. And Perl knows that node 0 is still in
887 use because something still holds a reference to it. Now, if you
892 This would change nothing -- there's just be I<two> things holding a
893 reference to the node 0 hash, which in turn holds a reference to the
894 node 1 and node 2 hashes. And if you then added:
898 still nothing would change, because something (C<$root>) still holds a
899 reference to the node 0 hash. But once I<nothing> holds a reference to
900 the node 0 hash, Perl will know it can destroy that hash (and reclaim
901 the memory for later use, say), and once it does that, nothing will hold
902 a reference to the node 1 or the node 2 hashes, and those will be
905 But consider if the node 1 and node 2 hashes each had an attribute
906 "parent" (or "predecessor") that held a reference to node 0. If your
907 program stopped holding a reference to the node 0 hash, Perl could
908 I<not> then say that I<nothing> holds a reference to node 0 -- because
909 node 1 and node 2 still do. So, the memory for nodes 0, 1, and 2 would
910 never get reclaimed (until your program ended, at which point Perl
911 destroys I<everything>). If your program grew and discarded lots of
912 nodes in the game tree, but didn't let Perl know it could reclaim their
913 memory, your program could grow to use immense amounts of memory --
914 never a nice thing to have happen. There's three ways around this:
916 1) When you're finished with a node, delete the reference each of its
917 children have to it (in this case, deleting $n1->{'parent'}, say).
918 When you're finished with a whole tree, just go through the whole tree
919 erasing links that children have to their children.
921 2) Reconsider whether you really need to have each node hold a reference
922 to its parent. Just not having those links will avoid the whole
925 3) use the WeakRef module with Perl 5.6 or later. This allows you to
926 "weaken" some references (like the references that node 1 and 2 could
927 hold to their parent) so that they don't count when Perl goes asking
928 whether anything holds a reference to a given piece of memory. This
929 wonderful new module eliminates the headaches that can often crop up
930 with either of the two previous methods.
932 It so happens that our Alak program is simple enough that we don't need
933 for our nodes to have links to their parents, so the second solution is
934 fine. But in a more advanced program, the first or third solutions
935 might be unavoidable.
937 =head2 Recursively Printing the Tree
939 I don't like working blind -- if I have any kind of a complex data
940 structure in memory for a program I'm working on, the first thing I do
941 is write something that can dump that structure to the screen so I can
942 make sure that what I I<think> is in memory really I<is> what's in
943 memory. Now, I could just use the "x" pretty-printer command in Perl's
944 interactive debugger, or I could have the program use the
945 C<Data::Dumper> module. But in this case, I think the output from those
946 is rather too verbose. Once we have trees with dozens of nodes in them,
947 we'll really want a dump of the tree to be as concise as possible,
948 hopefully just one line per node. What I'd like is something that can
949 print C<$n0> and its successors (see above) as something like:
951 xx_xx_oo_oo (O moved 9 to 7, 0 payoff)
952 xx__xxoo_oo (X moved 4 to 6, 0 payoff)
953 xx_x_xoo_oo (X moved 5 to 6, 0 payoff)
955 A subroutine to print a line for a given node, and then do that again for
956 each successor, would look something like:
959 my $n = $_[0]; # "n" is for node
961 ...something expressing $n'n content...
962 foreach my $s (@{$n->{'successors'}}) {
968 And we could just start that out with a call to C<dump_tree($n0)>.
970 Since this routine...
975 I first wrote this routine starting out with "sub dump {". But when
976 I tried actually calling C<dump($n0)>, Perl would dump core! Imagine
977 my shock when I discovered that this is absolutely to be expected --
978 Perl provides a built-in function called C<dump>, the purpose of which
979 is to, yes, make Perl dump core. Calling our routine "dump_tree"
980 instead of "dump" neatly avoids that problem.
984 ...does its work (dumping the subtree at and under the
985 given node) by calling itself, it's B<recursive>. However, there's a
986 special term for this kind of recursion across a tree: traversal. To
987 B<traverse> a tree means to do something to a node, and to traverse its
988 children. There's two prototypical ways to do this, depending on what
991 traversing X in pre-order:
993 * then traverse X's children
995 traversing X in post-order:
996 * traverse X's children
997 * then do something to X
999 Dumping the tree to the screen the way we want it happens to be a matter
1000 of pre-order traversal, since the thing we do (print a description of
1001 the node) happens before we recurse into the successors.
1003 When we try writing the C<print> statement for our above C<dump_tree>,
1004 we can get something like:
1009 # "xx_xx_oo_oo (O moved 9 to 7, 0 payoff)"
1011 $n->{'board'}, " (",
1012 ($n->{'whose_turn'} eq 'o' ? 'X' : 'O'),
1013 # Infer who last moved from whose turn it is now.
1014 " moved ", $n->{'last_move_from'},
1015 " to ", $n->{'last_move_to'},
1016 ", ", $n->{'last_move_payoff'},
1020 foreach my $s (@{$n->{'successors'}}) {
1025 If we run this on $n0 from above, we get this:
1027 xx_xx_oo_oo (O moved 9 to 7, 0 payoff)
1028 xx__xxoo_oo (X moved 4 to 6, 0 payoff)
1029 xx_x_xoo_oo (X moved 5 to 6, 0 payoff)
1031 Each line on its own is fine, but we forget to allow for indenting, and
1032 without that we can't tell what's a child of what. (Imagine if the
1033 first successor had successors of its own -- you wouldn't be able to
1034 tell if it were a child, or a sibling.) To get indenting, we'll need
1035 to have the instances of the C<dump_tree> routine know how far down in
1036 the tree they're being called, by passing a depth parameter between
1042 $depth = 0 unless defined $depth;
1046 foreach my $s (@{$n->{'successors'}}) {
1047 dump_tree($s, $depth + 1);
1051 When we call C<dump_tree($n0)>, C<$depth> (from C<$_[1]>) is undefined, so
1052 gets set to 0, which translates into an indenting of no spaces. But when
1053 C<dump_tree> invokes itself on C<$n0>'s children, those instances see
1054 C<$depth> + 1 as their C<$_[1]>, giving appropriate indenting.
1059 Passing values around between different invocations of a recursive
1060 routine, as shown, is a decent way to share the data. Another way
1061 to share the data is by keeping it in a global variable, like C<$Depth>,
1062 initially set to 0. Each time C<dump_tree> is about to recurse, it must
1063 C<++$Depth>, and when it's back, it must C<--$Depth>.
1065 Or, if the reader is familiar with closures, consider this approach:
1068 # A wrapper around calls to a recursive closure:
1069 my $start_node = $_[0];
1071 # to be shared across calls to $recursor.
1078 foreach my $s (@{$n->{'successors'}}) {
1083 $recursor->($start_node); # start recursing
1087 The reader with an advanced understanding of Perl's reference-count-based
1088 garbage collection is invited to consider why it is currently necessary
1089 to undef $recursor (or otherwise change its value) after all recursion
1092 The reader whose mind is perverse in other ways is invited to consider
1093 how (or when!) passing a depth parameter around is unnecessary because
1094 of information that Perl's C<caller(N)> function reports!
1100 =head2 Growing the Tree
1102 Our C<dump_tree> routine works fine for the sample tree we've got, so
1103 now we should get the program working on making its own trees, starting
1106 In C<Games::Alak> (the CPAN-released version of Alak that uses
1107 essentially the same code that we're currently discussing the
1108 tree-related parts of), there is a routine called C<figure_successors>
1109 that, given one childless node, will figure out all its possible
1110 successors. That is, it looks at the current board, looks at every piece
1111 belonging to the player whose turn it is, and considers the effect of
1112 moving each piece every possible way -- notably, it figures out the
1113 immediate payoff, and if that move would end the game, it notes that by
1114 setting an "endgame" entry in that node's hash. (That way, we know that
1115 that's a node that I<can't> have successors.)
1117 In the code for C<Games::Alak>, C<figure_successors> does all these things,
1118 in a rather straightforward way. I won't walk you through the details
1119 of the C<figure_successors> code I've written, since the code has
1120 nothing much to do with trees, and is all just implementation of the Alak
1121 rules for what can move where, with what result. Espicially interested
1122 readers can puzzle over that part of code in the source listing in the
1123 archive from CPAN, but others can just assume that it works as described
1126 But consider that C<figure_successors>, regardless of its inner
1127 workings, does not grow the I<tree>; it only makes one set of successors
1128 for one node at a time. It has to be up to a different routine to call
1129 C<figure_successors>, and to keep applying it as needed, in order to
1130 make a nice big tree that our game-playing program can base its
1133 Now, we could do this by just starting from one node, applying
1134 C<figure_successors> to it, then applying C<figure_successors> on all
1135 the resulting children, and so on:
1137 sub grow { # Just a first attempt at this!
1139 figure_successors($n);
1141 @{$n->{'successors'}}
1142 # already has successors.
1144 # can't have successors.
1146 foreach my $s (@{$n->{'successors'}}) {
1151 If you have a game tree for tic-tac-toe, and you grow it without
1152 limitation (as above), you will soon enough have a fully "solved" tree,
1153 where every node that I<can> have successors I<does>, and all the leaves
1154 of the tree are I<all> the possible endgames (where, in each case, the
1155 board is filled). But a game of Alak is different from tic-tac-toe,
1156 because it can, in theory, go on forever. For example, the following
1157 sequence of moves is quite possible:
1162 xxxx__o_ooo (x moved back)
1163 xxxx___oooo (o moved back)
1164 ...repeat forever...
1166 So if you tried using our above attempt at a C<grow> routine, Perl would
1167 happily start trying to construct an infinitely deep tree, containing
1168 an infinite number of nodes, consuming an infinite amount of memory, and
1169 requiring an infinite amount of time. As the old saying goes: "You
1170 can't have everything -- where would you put it?" So we have to place
1171 limits on how much we'll grow the tree.
1173 There's more than one way to do this:
1175 1. We could grow the tree until we hit some limit on the number of
1176 nodes we'll allow in the tree.
1178 2. We could grow the tree until we hit some limit on the amount of time
1179 we're willing to spend.
1181 3. Or we could grow the tree until it is fully fleshed out to a certain
1184 Since we already know to track depth (as we did in writing C<dump_tree>),
1185 we'll do it that way, the third way. The implementation for that third
1186 approach is also pretty straightforward:
1191 my $depth = $_[1] || 0;
1192 figure_successors($n)
1194 $depth >= $Max_depth
1195 or @{$n->{'successors'}}
1198 foreach my $s (@{$n->{'successors'}}) {
1199 grow($s, $depth + 1);
1201 # If we're at $Max_depth, then figure_successors
1202 # didn't get called, so there's no successors
1203 # to recurse under -- that's what stops recursion.
1206 If we start from a single node (whether it's a node for the starting board
1207 "xxxx___oooo", or for whatever board the computer is faced with), set
1208 C<$Max_depth> to 4, and apply C<grow> to it, it will grow the tree to
1209 include several hundred nodes.
1214 If at each move there are four pieces that can move, and they can each
1215 move right or left, the "branching factor" of the tree is eight, giving
1216 a tree with 1 (depth 0) + 8 (depth 1) + 8 ** 2 + 8 ** 3 + 8 ** 4 =
1217 4681 nodes in it. But, in practice, not all pieces can move in both
1218 directions (none of the x pieces in "xxxx___oooo" can move left, for
1219 example), and there may be fewer than four pieces, if some were lost.
1220 For example, there are 801 nodes in a tree of depth four starting
1221 from "xxxx___oooo", suggesting an average branching factor of about
1222 five (801 ** (1/4) is about 5.3), not eight.
1226 What we need to derive from that tree is the information about what
1227 are the best moves for X. The simplest way to consider the payoff of
1228 different successors is to just average them -- but what we average
1229 isn't always their immediate payoffs (because that'd leave us using
1230 only one generation of information), but the average payoff of I<their>
1231 successors, if any. We can formalize this as:
1233 To figure a node's average payoff:
1234 If the node has successors:
1235 Figure each successor's average payoff.
1236 My average payoff is the average of theirs.
1238 My average payoff is my immediate payoff.
1240 Since this involves recursing into the successors I<before> doing
1241 anything with the current node, this will traverse the tree
1244 We could work that up as a routine of its own, and apply that to the
1245 tree after we've applied C<grow> to it. But since we'd never
1246 grow the tree without also figuring the average benefit, we might as well
1247 make that figuring part of the C<grow> routine itself:
1252 my $depth = $_[1] || 0;
1253 figure_successors($n);
1255 $depth >= $Max_depth
1256 or @{$n->{'successors'}}
1260 if(@{$n->{'successors'}}) {
1261 my $a_payoff_sum = 0;
1262 foreach my $s (@{$n->{'successors'}}) {
1263 grow($s, $depth + 1); # RECURSE
1264 $a_payoff_sum += $s->{'average_payoff'};
1266 $n->{'average_payoff'}
1267 = $a_payoff_sum / @{$n->{'successors'}};
1269 $n->{'average_payoff'}
1270 = $n->{'last_move_payoff'};
1274 So, by time C<grow> has applied to a node (wherever in the tree it is),
1275 it will have figured successors if possible (which, in turn, sets
1276 C<last_move_payoff> for each node it creates), and will have set
1279 Beyond this, all that's needed is to start the board out with a root
1280 note of "xxxx___oooo", and have the computer (X) take turns with the
1281 user (O) until someone wins. Whenever it's O's turn, C<Games::Alak>
1282 presents a prompt to the user, letting him know the state of the current
1283 board, and asking what move he selects. When it's X's turn, the
1284 computer grows the game tree as necessary (using just the C<grow>
1285 routine from above), then selects the move with the highest average
1286 payoff (or one of the highest, in case of a tie).
1288 In either case, "selecting" a move means just setting that move's node
1289 as the new root of the program's game tree. Its sibling nodes and their
1290 descendants (the boards that I<didn't> get selected) and its parent node
1291 will be erased from memory, since they will no longer be in use (as Perl
1292 can tell by the fact that nothing holds references to them anymore).
1294 The interface code in C<Games::Alak> (the code that prompts the user for
1295 his move) actually supports quite a few options besides just moving --
1296 including dumping the game tree to a specified depth (using a slightly
1297 fancier version of C<dump_tree>, above), resetting the game, changing
1298 C<$Max_depth> in the middle of the game, and quitting the game. Like
1299 C<figure_successors>, it's a bit too long to print here, but interested
1300 users are welcome to peruse (and freely modify) the code, as well as to
1301 enjoy just playing the game.
1303 Now, in practice, there's more to game trees than this: for games with a
1304 larger branching factor than Alak has (which is most!), game trees of
1305 depth four or larger would contain too many nodes to be manageable, most
1306 of those nodes being strategically quite uninteresting for either
1307 player; dealing with game trees specifically is therefore a matter of
1308 recognizing uninteresting contingencies and not bothering to grow the
1314 For example, to choose a straightforward case: if O has a choice between
1315 moves that put him in immediate danger of X winning and moves that
1316 don't, then O won't ever choose the dangerous moves (and if he does, the
1317 computer will know enough to end the game), so there's no point in
1318 growing the tree any further beneath those nodes.
1322 But this sample implementation should illustrate the basics of
1323 how to build and manipulate a simple tree structure in memory.
1324 And once you've understood the basics of tree storage here, you should
1325 be ready to better understand the complexities and peculiarities of
1326 other systems for creating, accessing, and changing trees, including
1327 Tree::DAG_Node, HTML::Element, XML::DOM, or related formalisms
1330 B<[end body of article]>
1332 =head2 [Author Credit]
1334 Sean M. Burke (C<sburke@cpan.org>) is a tree-dwelling hominid.
1338 Dewdney, A[lexander] K[eewatin]. 1984. I<Planiverse: Computer Contact
1339 with a Two-Dimensional World.> Poseidon Press, New York.
1341 Knuth, Donald Ervin. 1997. I<Art of Computer Programming, Volume 1,
1342 Third Edition: Fundamental Algorithms>. Addison-Wesley, Reading, MA.
1344 Wirth, Niklaus. 1976. I<Algorithms + Data Structures = Programs>
1345 Prentice-Hall, Englewood Cliffs, NJ.
1347 Worth, Stan and Allman Sheldon. Circa 1967. I<George of the Jungle>
1348 theme. [music by Jay Ward.]
1350 Wirth's classic, currently and lamentably out of print, has a good
1351 section on trees. I find it clearer than Knuth's (if not quite as
1352 encyclopedic), probably because Wirth's example code is in a
1353 block-structured high-level language (basically Pascal), instead
1354 of in assembler (MIX). I believe the book was re-issued in the
1355 1980s under the titles I<Algorithms and Data Structures> and, in a
1356 German edition, I<Algorithmen und Datenstrukturen>. Cheap copies
1357 of these editions should be available through used book services
1358 such as C<abebooks.com>.
1360 Worth's classic, however, is available on the
1361 soundtrack to the 1997 I<George of the Jungle> movie, as
1362 performed by The Presidents of the United States of America.
1366 Return to the L<HTML::Tree|HTML::Tree> docs.