vcs.maemo.org Git - pkg-perl/blob - deb-src/libhtml-tree-perl/libhtml-tree-perl-3.23/lib/HTML/Tree/AboutTrees.pod

   1
   2 #Time-stamp: "2001-02-23 20:09:47 MST" -*-Text-*-
   3 # This document contains text in Perl "POD" format.
   4 # Use a POD viewer like perldoc or perlman to render it.
   5
   6 =head1 NAME
   7
   8 HTML::Tree::AboutTrees -- article on tree-shaped data structures in Perl
   9
  10 =head1 SYNOPSIS
  11
  12   # This an article, not a module.
  13
  14 =head1 DESCRIPTION
  15
  16 The following article by Sean M. Burke first appeared in I<The Perl
  17 Journal> #18 and is copyright 2000 The Perl Journal. It appears
  18 courtesy of Jon Orwant and The Perl Journal.  This document may be
  19 distributed under the same terms as Perl itself.
  20
  21 =head1 Trees
  22
  23 -- Sean M. Burke
  24
  25 =over
  26
  27 "AaaAAAaauugh!  Watch out for that tree!"
  28   -- I<George of the Jungle theme>
  29
  30 =back
  31
  32 Perl's facility with references, combined with its automatic management of
  33 memory allocation, makes it straightforward to write programs that store data
  34 in structures of arbitrary form and complexity.
  35
  36 But I've noticed that many programmers, especially those who started out
  37 with more restrictive languages, seem at home with complex but uniform
  38 data structures -- N-dimensional arrays, or more struct-like things like
  39 hashes-of-arrays(-of-hashes(-of-hashes), etc.) -- but they're often uneasy
  40 with buliding more freeform, less tabular structures, like
  41 tree-shaped data structures.
  42
  43 But trees are easy to build and manage in Perl, as I'll demonstrate
  44 by showing off how the HTML::Element class manages elements in an HTML
  45 document tree, and by walking you through a from-scratch implementation
  46 of game trees.  But first we need to nail down what we mean by a "tree".
  47
  48 =head2 Socratic Dialogues: "What is a Tree?"
  49
  50 My first brush with tree-shaped structures was in linguistics classes,
  51 where tree diagrams are used to describe the syntax underlying natural
  52 language sentences.  After learning my way around I<those> trees, I
  53 started to wonder -- are what I'm used to calling "trees" the same as what
  54 programmers call "trees"?  So I asked lots of helpful and patient
  55 programmers how they would define a tree.  Many replied with a
  56 answer in jargon that they could not really explain (understandable,
  57 since explaining things, especially defining things, is harder
  58 than people think):
  59
  60 =over
  61
  62 -- So what I<is> a "tree", a tree-shaped data structure?
  63
  64 -- A tree is a special case of an acyclic directed graph!
  65
  66 -- What's a "graph"?
  67
  68 -- Um... lines... and... you draw it... with... arcs! nodes!  um...
  69
  70 =back
  71
  72 The most helpful were folks who couldn't explain directly, but with
  73 whom I could get into a rather Socratic dialog (where I<I> asked the
  74 half-dim half-earnest questions), often with much doodling of
  75 illustrations...
  76
  77 Question: so what's a tree?
  78
  79 Answer: A tree is a collection of nodes that are linked together in a,
  80 well, tree-like way!  Like this I<[drawing on a napkin]:>
  81
  82      A
  83     / \
  84    B   C
  85      / | \
  86     D  E  F
  87
  88 Q: So what do these letters represent?
  89
  90 A: Each is a different node, a bunch of data.  Maybe C is a
  91 bunch of data that stores a number, maybe a hash table, maybe nothing
  92 at all besides the fact that it links to D, E, and F (which are other
  93 nodes).
  94
  95 Q: So what're the lines between the nodes?
  96
  97 A: Links.  Also called "arcs".  They just symbolize the fact that each
  98 node holds a list of nodes it links to.
  99
 100 Q: So what if I draw nodes and links, like this...
 101
 102      B -- E
 103     / \  / \
 104    A   C
 105     \ /
 106      E
 107
 108 Is that still a tree?
 109
 110 A: No, not at all.  There's a lot of un-treelike things about that.
 111 First off, E has a link coming off of it going into nowhere.  You can't have
 112 a link to nothing -- you can only link to another node.  Second off, I
 113 don't know what that sideways link between B and E means...
 114
 115 Q: Okay, let's work our way up from something simpler.  Is this a tree...?
 116
 117     A
 118
 119 A: Yes, I suppose.  It's a tree of just one node.
 120
 121 Q: And how about...
 122
 123    A
 124
 125    B
 126
 127 A: No, you can't just have nodes floating there, unattached.
 128
 129 Q: Okay, I'll link A and B.  How's this?
 130
 131    A
 132    |
 133    B
 134
 135 A: Yup, that's a tree.  There's a node A, and a node B, and they're linked.
 136
 137 Q: How is that tree any different from this one...?
 138
 139    B
 140    |
 141    A
 142
 143 A: Well, in both cases A and B are linked.  But it's in a different
 144 direction.
 145
 146 Q: Direction?  What does the direction mean?
 147
 148 A: Well, it depends what the tree represents.  If it represents a
 149 categorization, like this:
 150
 151           citrus
 152        /    |    \
 153    orange  lemon  kumquat ...
 154
 155 then you mean to say that oranges, lemons, kumquats, etc., are a kind of
 156 citrus.  But if you drew it upside down, you'd be saying, falsely, that
 157 citrus is a kind of kumquat, a kind of lemon, and a kind of orange.
 158 If the tree represented cause-and-effect (or at least what situations
 159 could follow others), or represented what's a part of what, you
 160 wouldn't want to get those backwards, either.  So with the nodes you
 161 draw together on paper, one has to be over the other, so you can tell which
 162 way the relationship in the tree works.
 163
 164 Q:  So are these two trees the same?
 165
 166      A          A
 167     / \        / \
 168    B   C      B   \
 169                    C
 170
 171 A: Yes, although by convention we often try to line up things in the
 172 same generation, like it is in the diagram on the left.
 173
 174 Q: "generation"?  This is a family tree?
 175
 176 A: No, not unless it's a family tree for just yeast cells or something
 177 else that reproduces asexually.
 178 But for sake of having lots of terms to use, we just pretend that links
 179 in the tree represent the "is a child of" relationship, instead of "is a
 180 kind of" or "is a part of", or "could result from", or whatever the real
 181 relationship is.  So we get to borrow a lot of kinship words for
 182 describing trees -- B and C are "children" (or "daughters") of A; A is
 183 the "parent" (or "mother") of B and C.  Node C is a "sibling" (or
 184 "sister") of node C; and so on, with terms like "descedants" (a node's
 185 children, children's children, etc.), and "generation" (all the
 186 nodes at the same "level" in the tree, i.e., are either all
 187 grandchildren of the top node, or all great-grand-children, etc.), and
 188 "lineage" or "ancestors" (parents, and parent's parents, etc., all the
 189 way to the topmost node).
 190
 191 So then we get to express rules in terms like "B<A node cannot have more
 192 than one parent>", which means that this is not a valid tree:
 193
 194     A
 195    / \
 196   B   C
 197    \ /
 198     E
 199
 200 And: "B<A node can't be its own parent>", which excludes this looped-up
 201 connection:
 202
 203     /\
 204    A  |
 205     \/
 206
 207 Or, put more generally: "B<A node can't be its own ancestor>", which
 208 excludes the above loop, as well as the one here:
 209
 210       /\
 211      Z  |
 212     /   |
 213    A    |
 214   / \   |
 215  B   C  |
 216       \/
 217
 218 That tree is excluded because A is a child of Z, and Z is a child of C,
 219 and C is a child of A, which means A is its own great-grandparent.  So
 220 this whole network can't be a tree, because it breaks the sort of
 221 meta-rule: B<once any node in the supposed tree breaks the rules for
 222 trees, you don't have a tree anymore.>
 223
 224 Q: Okay, now, are these two trees the same?
 225
 226      A         A
 227    / | \     / | \
 228   B  C  D   D  C  B
 229
 230 A: It depends whether you're basing your concept of trees on each node
 231 having a set (unordered list) of children, or an (ordered) list of
 232 children.  It's a question of whether ordering is important for what
 233 you're doing.  With my diagram of citrus types, ordering isn't
 234 important, so these tree diagrams express the same thing:
 235
 236           citrus
 237        /    |    \
 238    orange  lemon  kumquat
 239
 240            citrus
 241        /     |    \
 242    kumquat  orange  lemon
 243
 244 because it doesn't make sense to say that oranges are "before" or
 245 "after" kumquats in the whole botanical scheme of things.  (Unless, of
 246 course, you I<are> using ordering to mean something, like a degree of
 247 genetic similarity.)
 248
 249 But consider a tree that's a diagram of what steps are comprised in an
 250 activity, to some degree of specificity:
 251
 252            make tea
 253          /    |     \
 254    pour     infuse   serve
 255  hot water    / \
 256 in cup/pot  /     \
 257            add     let
 258            tea     sit
 259           leaves
 260
 261 This means that making tea consists of putting hot water in a cup or
 262 put, infusing it (which itself consists of adding tea leaves and letting
 263 it sit), then serving it -- I<in that order>.  If you serve an empty
 264 dry pot (sipping from empty cups, etc.), let it sit, add tea leaves,
 265 and pour in hot water, then what you're doing is performance art, not
 266 tea preparation:
 267
 268           perfomance
 269             art
 270         /    |     \
 271    serve   infuse    pour
 272             / \       hot water
 273           /     \      in cup/pot
 274          let     add
 275          sit     tea
 276                 leaves
 277
 278 Except for my having renamed the root, this tree is the same as
 279 the making-tea tree as far as what's under what, but it differs
 280 in order, and what the tree means makes the order important.
 281
 282 Q: Wait -- "root"?  What's a root?
 283
 284 A: Besides kinship terms like "mother" and "daugher", the jargon for
 285 tree parts also has terms from real-life tree parts:  the part that
 286 everything else grows from is called the root; and nodes that don't
 287 have nodes attached to them (i.e., childless nodes) are called
 288 "leaves".
 289
 290 Q: But you've been drawing all your trees with the root at the top and
 291 leaves at the bottom.
 292
 293 A: Yes, but for some reason, that's the way everyone seems to think of
 294 trees.  They can draw trees as above; or they can draw them sort of
 295 sideways with indenting representing what nodes are children of what:
 296
 297   * make tea
 298      * pour hot water in cup/pot
 299      * infuse
 300         * add tea leaves
 301         * let sit
 302      * serve
 303
 304 ...but folks almost never seem to draw trees with the root at the
 305 bottom.  So imagine it's based on spider plant in a hanging pot.
 306 Unfortunately, spider plants I<aren't> botanically trees, they're
 307 plants; but "spider plant diagram" is rather a mouthful, so let's just
 308 call them trees.
 309
 310 =head2 Trees Defined Formally
 311
 312 In time, I digested all these assorted facts about programmers' ideas of
 313 trees (which turned out to be just a more general case of linguistic
 314 ideas of trees) into a single rule:
 315
 316 * A node is an item that contains ("is over", "is parent of", etc.)
 317 zero or more other nodes.
 318
 319 From this you can build up formal definitions for useful terms, like so:
 320
 321 * A node's B<descendants> are defined as all its children, and all
 322 their children, and so on.  Or, stated recursively: a node's
 323 descendants are all its children, and all its children's descendants.
 324 (And if it has no children, it has no descendants.)
 325
 326 * A node's B<ancestors> consist of its parent, and its parent's
 327 parent, etc, up to the root.  Or, recursively: a node's ancestors
 328 consist of its parent and its parent's ancestors.  (If it has no parent,
 329 it has no ancestors.)
 330
 331 * A B<tree> is a root node and all the root's descendants.
 332
 333 And you can add a proviso or two to clarify exactly what I impute to the
 334 word "other" in "other nodes":
 335
 336 * A node cannot contain itself, or contain any node that contains it,
 337 etc.  Looking at it the other way: a node cannot be its own parent or
 338 ancestor.
 339
 340 * A node can be root (i.e., no other node contains it) or can be
 341 contained by only one parent; no node can be the child of two or more
 342 parents.
 343
 344 Add to this the idea that children are sometimes ordered, and sometimes
 345 not, and that's about all you need to know about defining what a tree
 346 is.  From there it's a matter of using them.
 347
 348 =head2 Markup Language Trees: HTML-Tree
 349
 350 While not I<all> markup languages are inherently tree-like, the
 351 best-known family of markup languages, HTML, SGML, and XML, are about
 352 as tree-like as you can get.  In these languages, a document consists
 353 of elements and character data in a tree structure where
 354 there is one root element, and elements can contain either other
 355 elements, or character data.
 356
 357 =over
 358
 359 Footnote:
 360 For sake of simplicity, I'm glossing over
 361 comments (<!-- ... -->), processing instructions (<?xml
 362 version='1.0'>), and declarations (<!ELEMENT ...>, <!DOCTYPE ...>).
 363 And I'm not bothering to distinguish entity references
 364 (&lt;, &#64;) or CDATA sections (<![CDATA[ ...]]>) from normal text.
 365
 366 =back
 367
 368 For example, consider this HTML document:
 369
 370   <html lang="en-US">
 371     <head>
 372       <title>
 373         Blank Document!
 374       </title>
 375     </head>
 376     <body bgcolor="#d010ff">
 377       I've got
 378       <em>
 379         something to saaaaay
 380       </em>
 381       !
 382     </body>
 383   </html>
 384
 385 I've indented this to point out what nodes (elements or text items) are
 386 children of what, with each node on a line of its own.
 387
 388 The HTML::TreeBuilder module (in the CPAN distribution HTML-Tree)
 389 does the work of taking HTML source and
 390 building in memory the tree that the document source represents.
 391
 392 =over
 393
 394 Footnote: it requires the HTML::Parser module, which tokenizes the
 395 source -- i.e., identifies each tag, bit of text, comment, etc.
 396
 397 =back
 398
 399 The trees structures that it builds represent bits of text with
 400 normal Perl scalar string values; but elements are represented with
 401 objects -- that is, chunks of data that belong to a
 402 class (in this case, HTML::Element), a class that provides methods
 403 (routines) for accessing the pieces of data in each element, and
 404 otherwise doing things with elements.  (See my article in TPJ#17 for a
 405 quick explanation of objects, the POD document C<perltoot> for a longer
 406 explanation, or Damian Conway's excellent book I<Object-Oriented Perl>
 407 for the full story.)
 408
 409 Each HTML::Element object contains a number of pieces of data:
 410
 411 * its element name ("html", "h1", etc., accessed as $element->tag)
 412
 413 * a list of elements (or text segments) that it contains, if any
 414 (accessed as $element->content_list or $element->content, depending on
 415 whether you want a list, or an arrayref)
 416
 417 * what element, if any, contains it (accessed as $element->parent)
 418
 419 * and any SGML attributes that the element has,
 420 such as C<lang="en-US">, C<align="center">, etc. (accessed as
 421 $element->attr('lang'), $element->attr('center'), etc.)
 422
 423 So, for example, when HTML::TreeBuilder builds the tree for the above
 424 HTML document source, the object for the "body" element has these pieces of
 425 data:
 426
 427  * element name: "body"
 428  * nodes it contains:
 429     the string "I've got "
 430     the object for the "em" element
 431     the string "!"
 432  * its parent:
 433     the object for the "html" element
 434  * bgcolor: "#d010ff"
 435
 436 Now, once you have this tree of objects, almost anything you'd want to
 437 do with it starts with searching the tree for some bit of information
 438 in some element.
 439
 440 Accessing a piece of information in, say, a hash of hashes of hashes,
 441 is straightforward:
 442
 443   $password{'sean'}{'sburke1'}{'hpux'}
 444
 445 because you know that all data points in that structure are accessible
 446 with that syntax, but with just different keys.  Now, the "em" element
 447 in the above HTML tree does happen to be accessible
 448 as the root's child #1's child #1:
 449
 450   $root->content->[1]->content->[1]
 451
 452 But with trees, you typically don't know the exact location (via
 453 indexes) of the data you're looking for.  Instead, finding what you want
 454 will typically involve searching through the tree, seeing if every node is
 455 the kind you want.  Searching the whole tree is simple enough -- look at
 456 a given node, and if it's not what you want, look at its children, and
 457 so on.  HTML-Tree provides several methods that do this for you, such as
 458 C<find_by_tag_name>, which returns the elements (or the first element, if
 459 called in scalar context) under a given node (typically the root) whose
 460 tag name is whatever you specify.
 461
 462 For example, that "em" node can be found as:
 463
 464   my $that_em = $root->find_by_tag_name('em');
 465
 466 or as:
 467
 468   @ems = $root->find_by_tag_name('em');
 469    # will only have one element for this particular tree
 470
 471 Now, given an HTML document of whatever structure and complexity, if you
 472 wanted to do something like change every
 473
 474 =over
 475
 476 E<lt>emE<gt>I<stuff>E<lt>/emE<gt>
 477
 478 =back
 479
 480 to
 481
 482 =over
 483
 484 E<lt>em class="funky"E<gt>
 485 B<E<lt>bE<gt>[-E<lt>/bE<gt>>
 486 I<stuff>
 487 B<E<lt>bE<gt>-]E<lt>/bE<gt>>
 488 E<lt>/emE<gt>
 489
 490 =back
 491
 492 the first step is to frame this operation in terms of what you're doing
 493 to the tree.  You're changing this:
 494
 495       em
 496        |
 497       ...
 498
 499 to this:
 500
 501       em
 502     /  |  \
 503    b  ...   b
 504    |        |
 505   "[-"     "-]"
 506
 507 In other words, you're finding all elements whose tag name is "em",
 508 setting its class attribute to "funky", and adding one child to the start
 509 of its content list -- a new "b" element
 510 whose content is the text string "[-" -- and one to the end of its
 511 content list -- a new "b" element whose content is the text string "-]".
 512
 513 Once you've got it in these terms, it's just a matter of running to the
 514 HTML::Element documentation, and coding this up with calls to the
 515 appropriate methods, like so:
 516
 517   use HTML::Element 1.53;
 518   use HTML::TreeBuilder 2.96;
 519   # Build the tree by parsing the document
 520   my $root = HTML::TreeBuilder->new;
 521   $root->parse_file('whatever.html'); # source file
 522
 523   # Now make new nodes where needed
 524   foreach my $em ($root->find_by_tag_name('em')) {
 525     $em->attr('class', 'funky'); # Set that attribute
 526
 527     # Make the two new B nodes
 528     my $new1 = HTML::Element->new('b');
 529     my $new2 = HTML::Element->new('b');
 530     # Give them content (they have none at first)
 531     $new1->push_content('[-');
 532     $new2->push_content('-]');
 533
 534     # And put 'em in place!
 535     $em->unshift_content($new1);
 536     $em->push_content($new2);
 537   }
 538   print
 539    "<!-- Looky see what I did! -->\n",
 540    $root->as_HTML(), "\n";
 541
 542 The class HTML::Element provides just about every method I can image you
 543 needing, for manipulating trees made of HTML::Element objects.  (And
 544 what it doesn't directly provide, it will give you the components to build
 545 it with.)
 546
 547 =head2 Building Your Own Trees
 548
 549 Theoretically, any tree is pretty much like any other tree, so you could
 550 use HTML::Element for anything you'd ever want to do with tree-arranged
 551 objects.  However, as its name implies, HTML::Element is basically
 552 I<for> HTML elements; it has lots of features that make sense only for
 553 HTML elements (like the idea that every element must have a tag-name).
 554 And it lacks some features that might be useful for general applications
 555 -- such as any sort of checking to make sure that you're not trying to
 556 arrange objects in a non-treelike way.  For a general-purpose tree class
 557 that does have such features, you can use Tree::DAG_Node, also available
 558 from CPAN.
 559
 560 However, if your task is simple enough, you might find it overkill to
 561 bother using Tree::DAG_Node.  And, in any case, I find that the best
 562 way to learn how something works is to implement it (or something like
 563 it, but simpler) yourself.  So I'll here discuss how you'd implement a tree
 564 structure, I<without> using any of the existing classes for tree nodes.
 565
 566 =head2 Implementation: Game Trees for Alak
 567
 568 Suppose that the task at hand is to write a program that can play
 569 against a human opponent at a strategic board game (as opposed to a
 570 board game where there's an element of chance).  For most such games, a
 571 "game tree" is an essential part of the program (as I will argue,
 572 below), and this will be our test case for implementing a tree
 573 structure from stratch.
 574
 575 For sake of simplicity, our game is not chess or backgammon, but instead
 576 a much simpler game called Alak.  Alak was invented by the mathematician
 577 A. K.  Dewdney, and described in his 1984 book I<Planiverse>. The rules
 578 of Alak are simple:
 579
 580 =over
 581
 582 Footnote: Actually, I'm describing only my
 583 interpetation of the rules Dewdney describes in I<Planiverse>.  Many
 584 other interpretations are possible.
 585
 586 =back
 587
 588 * Alak is a two-player game played on a one-dimensional board with
 589 eleven slots on it.  Each slot can hold at most one piece at a time.
 590 There's two kinds of pieces, which I represent here as "x" and "o" --
 591 x's belong to one player (called X), o's to the other (called O).
 592
 593 * The initial configuration of the board is:
 594
 595    xxxx___oooo
 596
 597 For sake of the article, the slots are numbered from 1 (on the left) to
 598 11 (on the right), and X always has the first move.
 599
 600 * The players take turns moving.  At each turn, each player can move
 601 only one piece, once.  (This unlike checkers, where you move one piece
 602 per move but get to keep moving it if you jump an your opponent's
 603 piece.) A player cannot pass up on his turn.  A player can move any one
 604 of his pieces to the next unoccupied slot to its right or left, which
 605 may involve jumping over occupied slots.  A player cannot move a piece
 606 off the side of the board.
 607
 608 * If a move creates a pattern where the opponent's pieces are
 609 surrounded, on both sides, by two pieces of the mover's color (with no
 610 intervening unoccupied blank slot), then those surrounded pieces are
 611 removed from the board.
 612
 613 * The goal of the game is to remove all of your opponent's pieces, at
 614 which point the game ends.  Removing all-but-one ends the game as
 615 well, since the opponent can't surround you with one piece, and so will
 616 always lose within a few moves anyway.
 617
 618 Consider, then, this rather short game where X starts:
 619
 620   xxxx___oooo
 621     ^         Move 1: X moves from 3 (shown with caret) to 5
 622                (Note that any of X's pieces could move, but
 623                that the only place they could move to is 5.)
 624   xx_xx__oooo
 625           ^   Move 2: O moves from 9 to 7.
 626   xx_xx_oo_oo
 627      ^        Move 3: X moves from 4 to 6.
 628   xx__xxoo_oo
 629            ^  Move 4: O (stupidly) moves from 10 to 9.
 630   xx__xxooo_o
 631       ^       Move 5: X moves from 5 to 10, making the board
 632               "xx___xoooxo".  The three o's that X just
 633               surrounded are removed.
 634   xx___x___xo
 635               O has only one piece, so has lost.
 636
 637 Now, move 4 could have gone quite the other way:
 638
 639   xx__xxoo_oo
 640               Move 4: O moves from 8 to 4, making the board
 641               "xx_oxxo__oo".  The surrounded x's are removed.
 642   xx_o__o__oo
 643   ^           Move 5: X moves from 1 to 2.
 644   _xxo__o__oo
 645         ^     Move 6: O moves from 7 to 6.
 646   _xxo_o___oo
 647    ^          Move 7: X moves from 2 to 5, removing the o at 4.
 648   __x_xo___oo
 649               ...and so on.
 650
 651 To teach a computer program to play Alak (as player X, say), it needs to
 652 be able to look at the configuration of the board, figure out what moves
 653 it can make, and weigh the benefit or costs, immediate or eventual, of
 654 those moves.
 655
 656 So consider the board from just before move 3, and figure all the possible
 657 moves X could make.  X has pieces in slots 1, 2, 4, and 5.  The leftmost
 658 two x's (at 1 and 2) are up against the end of the board, so they
 659 can move only right.  The other two x's (at 4 and 5) can move either
 660 right or left:
 661
 662   Starting board: xx_xx_oo_oo
 663    moving 1 to 3 gives _xxxx_oo_oo
 664    moving 2 to 3 gives x_xxx_oo_oo
 665    moving 4 to 3 gives xxx_x_oo_oo
 666    moving 5 to 3 gives xxxx__oo_oo
 667    moving 4 to 6 gives xx__xxoo_oo
 668    moving 5 to 6 gives xx_x_xoo_oo
 669
 670 For the computer to decide which of these is the best move to make, it
 671 needs to quantify the benefit of these moves as a number -- call that
 672 the "payoff".  The payoff of a move can be figured as just the number
 673 of x pieces removed by the most recent move, minus the nubmer of o
 674 pieces removed by the most recent move.  (It so happens that the rules
 675 of the game mean that no move can delete both o's and x's, but the
 676 formula still applies.)  Since none of these moves removed any pieces,
 677 all these moves have the same immediate payoff: 0.
 678
 679 Now, we could race ahead and write an Alak-playing program that could
 680 use the immediate payoff to decide which is the best move to make.
 681 And when there's more than one best move (as here, where all the moves
 682 are equally good), it could choose randomly between the good
 683 alternatives.  This strategy is simple to implement; but it makes for a
 684 very dumb program.  Consider what O's response to each of the potential
 685 moves (above) could be.  Nothing immediately suggests itself for the
 686 first four possibilities (X having moved something to position 3), but
 687 either of the last two (illustrated below) are pretty perilous,
 688 because in either case O has the obvious option (which he would be
 689 foolish to pass up) of removing x's from the board:
 690
 691    xx_xx_oo_oo
 692       ^        X moves 4 to 6.
 693    xx__xxoo_oo
 694           ^    O moves 8 to 4, giving "xx_oxxo__oo".  The two
 695                surrounded x's are removed.
 696    xx_o__o__oo
 697
 698 or
 699
 700    xx_xx_oo_oo
 701        ^       X moves 5 to 6.
 702    xx_x_xoo_oo
 703           ^    O moves 8 to 5, giving "xx_xoxo__oo".  The one
 704                surrounded x is removed.
 705    xx_xo_o__oo
 706
 707 Both contingencies are quite bad for X -- but this is not captured
 708 by the fact that they start out with X thinking his move will be
 709 harmless, having a payoff of zero.
 710
 711 So what's needed is for X to think I<more> than one step ahead -- to
 712 consider not merely what it can do in this move, and what the payoff
 713 is, but to consider what O might do in response, and the
 714 payoff of those potential moves, and so on with X's possible responses
 715 to those cases could be.  All these possibilities form a game tree -- a
 716 tree where each node is a board, and its children are successors of
 717 that node -- i.e., the boards that could result from every move
 718 possible, given the parent's board.
 719
 720 But how to represent the tree, and how to represent the nodes?
 721
 722 Well, consider that a node holds several pieces of data:
 723
 724 1) the configuration of the board, which, being nice and simple and
 725 one-dimensional, can be stored as just a string, like "xx_xx_oo_oo".
 726
 727 2) whose turn it is, X or O.  (Or: who moved last, from which we can
 728 figure whose turn it is).
 729
 730 3) the successors (child nodes).
 731
 732 4) the immediate payoff of having moved to this board position from its
 733 predecessor (parent node).
 734
 735 5) and what move gets us from our predecessor node to here.  (Granted,
 736 knowing the board configuration before and after the move, it's easy to
 737 figure out the move; but it's easier still to store it as one is
 738 figuring out a node's successors.)
 739
 740 6) whatever else we might want to add later.
 741
 742 These could be stored equally well in an array or in a hash, but it's my
 743 experience that hashes are best for cases where you have more than just
 744 two or three bits of data, or especially when you might need to add new
 745 bits of data.  Moreover, hash key names are mnemonic --
 746 $node->{'last_move_payoff'} is plain as day, whereas it's not so easy having to
 747 remember with an array that $node->[3] is where you decided to keep the
 748 payoff.
 749
 750 =over
 751
 752 Footnote:
 753 Of course, there are ways around that problem: just swear you'll never
 754 use a real numeric index to access data in the array, and instead use
 755 constants with mnemonic names:
 756
 757   use strict;
 758   use constant idx_PAYOFF => 3;
 759   ...
 760   $n->[idx_PAYOFF]
 761
 762 Or use a pseudohash.  But I prefer to keep it simple, and use a hash.
 763
 764 These are, incidentally, the same arguments that
 765 people weigh when trying to decide whether their object-oriented
 766 modules should be based on blessed hashes, blessed arrays, or what.
 767 Essentially the only difference here is that we're not blessing our
 768 nodes or talking in terms of classes and methods.
 769
 770 [end footnote]
 771
 772 =back
 773
 774 So, we might as well represent nodes like so:
 775
 776   $node = { # hashref
 777      'board'          => ...board string, e.g., "xx_x_xoo_oo"
 778
 779      'last_move_payoff' => ...payoff of the move
 780                             that got us here.
 781
 782      'last_move_from' =>  ...the start...
 783      'last_move_to'   =>  ...and end point of the move
 784                               that got us here.  E.g., 5 and 6,
 785                               representing a move from 5 to 6.
 786
 787      'whose_turn'     => ...whose move it then becomes.
 788                            just an 'x' or 'o'.
 789
 790      'successors' => ...the successors
 791   };
 792
 793 Note that we could have a field called something like 'last_move_who' to
 794 denote who last moved, but since turns in Alak always alternate (and
 795 no-one can pass), storing whose move it is now I<and> who last moved is
 796 redundant -- if X last moved, it's O turn now, and vice versa.
 797 I chose to have a 'whose_turn' field instead of a 'last_move_who', but
 798 it doesn't really matter.  Either way, we'll end up inferring one from
 799 the other at several points in the program.
 800
 801 When we want to store the successors of a node, should we use an array
 802 or a hash?  On the one hand, the successors to $node aren't essentially
 803 ordered, so there's no reason to use an array per se; on the other hand,
 804 if we used a hash, with successor nodes as values, we don't have
 805 anything particularly meaningful to use as keys.  (And we can't use the
 806 successors themselves as keys, since the nodes are referred to by
 807 hash references, and you can't use a reference as a hash key.)  Given no
 808 particularly compelling reason to do otherwise, I choose to just use an
 809 array to store all a node's successors, although the order is never
 810 actually used for anything:
 811
 812   $node = {
 813     ...
 814     'successors' => [ ...nodes... ],
 815     ...
 816   };
 817
 818 In any case, now that we've settled on what should be in a node,
 819 let's make a little sample tree out of a few nodes and see what we can
 820 do with it:
 821
 822   # Board just before move 3 in above game
 823   my $n0 = {
 824     'board' => 'xx_xx_oo_oo',
 825     'last_move_payoff' => 0,
 826     'last_move_from' =>  9,
 827     'last_move_to'   =>  7,
 828     'whose_turn' => 'x',
 829     'successors' => [],
 830   };
 831
 832   # And, for now, just two of the successors:
 833
 834   # X moves 4 to 6, giving xx__xxoo_oo
 835   my $n1 = {
 836     'board' => 'xx__xxoo_oo',
 837     'last_move_payoff' => 0,
 838     'last_move_from' =>  4,
 839     'last_move_to'   =>  6,
 840     'whose_turn' => 'o',
 841     'successors' => [],
 842   };
 843
 844   # or X moves 5 to 6, giving xx_x_xoo_oo
 845   my $n2 = {
 846     'board' => 'xx_x_xoo_oo',
 847     'last_move_payoff' => 0,
 848     'last_move_from' =>  5,
 849     'last_move_to'   =>  6,
 850     'whose_turn' => 'o',
 851     'successors' => [],
 852   };
 853
 854   # Now connect them...
 855   push @{$n0->{'successors'}}, $n1, $n2;
 856
 857 =head2 Digression: Links to Parents
 858
 859 In comparing what we store in an Alak game tree node to what
 860 HTML::Element stores in HTML element nodes, you'll note one big
 861 difference: every HTML::Element node contains a link to its parent,
 862 whereas we don't have our Alak nodes keeping a link to theirs.
 863
 864 The reason this can be an important difference is because it can affect
 865 how Perl knows when you're not using pieces of memory anymore.
 866 Consider the tree we just built, above:
 867
 868       node 0
 869      /      \
 870   node 1    node 2
 871
 872 There's two ways Perl knows you're using a piece of memory:
 873 1) it's memory that belongs directly to a variable (i.e., is necessary
 874 to hold that variable's value, or valueI<s> in the case of a hash or
 875 array), or 2) it's a piece of memory that something holds a reference
 876 to.  In the above code, Perl knows that the hash for node 0 (for board
 877 "xx_xx_oo_oo") is in use because something (namely, the variable
 878 C<$n0>) holds a reference to it.  Now, even if you followed the above
 879 code with this:
 880
 881   $n1 = $n2 = 'whatever';
 882
 883 to make your variables C<$n1> and C<$n2> stop holding references to
 884 the hashes for the two successors of node 0, Perl would still know that
 885 those hashes are still in use, because node 0's successors array holds
 886 a reference to those hashes.  And Perl knows that node 0 is still in
 887 use because something still holds a reference to it.  Now, if you
 888 added:
 889
 890   my $root = $n0;
 891
 892 This would change nothing -- there's just be I<two> things holding a
 893 reference to the node 0 hash, which in turn holds a reference to the
 894 node 1 and node 2 hashes.  And if you then added:
 895
 896   $n0 = 'stuff';
 897
 898 still nothing would change, because something (C<$root>) still holds a
 899 reference to the node 0 hash.  But once I<nothing> holds a reference to
 900 the node 0 hash, Perl will know it can destroy that hash (and reclaim
 901 the memory for later use, say), and once it does that, nothing will hold
 902 a reference to the node 1 or the node 2 hashes, and those will be
 903 destroyed too.
 904
 905 But consider if the node 1 and node 2 hashes each had an attribute
 906 "parent" (or "predecessor") that held a reference to node 0.  If your
 907 program stopped holding a reference to the node 0 hash, Perl could
 908 I<not> then say that I<nothing> holds a reference to node 0 -- because
 909 node 1 and node 2 still do.  So, the memory for nodes 0, 1, and 2 would
 910 never get reclaimed (until your program ended, at which point Perl
 911 destroys I<everything>).  If your program grew and discarded lots of
 912 nodes in the game tree, but didn't let Perl know it could reclaim their
 913 memory, your program could grow to use immense amounts of memory --
 914 never a nice thing to have happen.  There's three ways around this:
 915
 916 1) When you're finished with a node, delete the reference each of its
 917 children have to it (in this case, deleting $n1->{'parent'}, say).
 918 When you're finished with a whole tree, just go through the whole tree
 919 erasing links that children have to their children.
 920
 921 2) Reconsider whether you really need to have each node hold a reference
 922 to its parent.  Just not having those links will avoid the whole
 923 problem.
 924
 925 3) use the WeakRef module with Perl 5.6 or later.  This allows you to
 926 "weaken" some references (like the references that node 1 and 2 could
 927 hold to their parent) so that they don't count when Perl goes asking
 928 whether anything holds a reference to a given piece of memory.  This
 929 wonderful new module eliminates the headaches that can often crop up
 930 with either of the two previous methods.
 931
 932 It so happens that our Alak program is simple enough that we don't need
 933 for our nodes to have links to their parents, so the second solution is
 934 fine.  But in a more advanced program, the first or third solutions
 935 might be unavoidable.
 936
 937 =head2 Recursively Printing the Tree
 938
 939 I don't like working blind -- if I have any kind of a complex data
 940 structure in memory for a program I'm working on, the first thing I do
 941 is write something that can dump that structure to the screen so I can
 942 make sure that what I I<think> is in memory really I<is> what's in
 943 memory.  Now, I could just use the "x" pretty-printer command in Perl's
 944 interactive debugger, or I could have the program use the
 945 C<Data::Dumper> module.  But in this case, I think the output from those
 946 is rather too verbose.  Once we have trees with dozens of nodes in them,
 947 we'll really want a dump of the tree to be as concise as possible,
 948 hopefully just one line per node.  What I'd like is something that can
 949 print C<$n0> and its successors (see above) as something like:
 950
 951   xx_xx_oo_oo  (O moved 9 to 7, 0 payoff)
 952     xx__xxoo_oo  (X moved 4 to 6, 0 payoff)
 953     xx_x_xoo_oo  (X moved 5 to 6, 0 payoff)
 954
 955 A subroutine to print a line for a given node, and then do that again for
 956 each successor, would look something like:
 957
 958   sub dump_tree {
 959     my $n = $_[0]; # "n" is for node
 960     print
 961       ...something expressing $n'n content...
 962     foreach my $s (@{$n->{'successors'}}) {
 963       # "s for successor
 964       dump($s);
 965     }
 966   }
 967
 968 And we could just start that out with a call to C<dump_tree($n0)>.
 969
 970 Since this routine...
 971
 972 =over
 973
 974 Footnote:
 975 I first wrote this routine starting out with "sub dump {".  But when
 976 I tried actually calling C<dump($n0)>, Perl would dump core!  Imagine
 977 my shock when I discovered that this is absolutely to be expected --
 978 Perl provides a built-in function called C<dump>, the purpose of which
 979 is to, yes, make Perl dump core.  Calling our routine "dump_tree"
 980 instead of "dump" neatly avoids that problem.
 981
 982 =back
 983
 984 ...does its work (dumping the subtree at and under the
 985 given node) by calling itself, it's B<recursive>.  However, there's a
 986 special term for this kind of recursion across a tree: traversal.  To
 987 B<traverse> a tree means to do something to a node, and to traverse its
 988 children.  There's two prototypical ways to do this, depending on what
 989 happens when:
 990
 991   traversing X in pre-order:
 992     * do something to X
 993     * then traverse X's children
 994
 995   traversing X in post-order:
 996     * traverse X's children
 997     * then do something to X
 998
 999 Dumping the tree to the screen the way we want it happens to be a matter
1000 of pre-order traversal, since the thing we do (print a description of
1001 the node) happens before we recurse into the successors.
1002
1003 When we try writing the C<print> statement for our above C<dump_tree>,
1004 we can get something like:
1005
1006   sub dump_tree {
1007     my $n = $_[0];
1008
1009     # "xx_xx_oo_oo  (O moved 9 to 7, 0 payoff)"
1010     print
1011       $n->{'board'}, "  (",
1012       ($n->{'whose_turn'} eq 'o' ? 'X' : 'O'),
1013       # Infer who last moved from whose turn it is now.
1014       " moved ", $n->{'last_move_from'},
1015       " to ",    $n->{'last_move_to'},
1016       ", ",      $n->{'last_move_payoff'},
1017       " payoff)\n",
1018     ;
1019
1020     foreach my $s (@{$n->{'successors'}}) {
1021       dump_tree($s);
1022     }
1023   }
1024
1025 If we run this on $n0 from above, we get this:
1026
1027   xx_xx_oo_oo  (O moved 9 to 7, 0 payoff)
1028   xx__xxoo_oo  (X moved 4 to 6, 0 payoff)
1029   xx_x_xoo_oo  (X moved 5 to 6, 0 payoff)
1030
1031 Each line on its own is fine, but we forget to allow for indenting, and
1032 without that we can't tell what's a child of what.  (Imagine if the
1033 first successor had successors of its own -- you wouldn't be able to
1034 tell if it were a child, or a sibling.)  To get indenting, we'll need
1035 to have the instances of the C<dump_tree> routine know how far down in
1036 the tree they're being called, by passing a depth parameter between
1037 them:
1038
1039   sub dump_tree {
1040     my $n = $_[0];
1041     my $depth = $_[1];
1042     $depth = 0 unless defined $depth;
1043     print
1044       "  " x $depth,
1045       ...stuff...
1046     foreach my $s (@{$n->{'successors'}}) {
1047       dump_tree($s, $depth + 1);
1048     }
1049   }
1050
1051 When we call C<dump_tree($n0)>, C<$depth> (from C<$_[1]>) is undefined, so
1052 gets set to 0, which translates into an indenting of no spaces.  But when
1053 C<dump_tree> invokes itself on C<$n0>'s children, those instances see
1054 C<$depth> + 1 as their C<$_[1]>, giving appropriate indenting.
1055
1056 =over
1057
1058 Footnote:
1059 Passing values around between different invocations of a recursive
1060 routine, as shown, is a decent way to share the data.  Another way
1061 to share the data is by keeping it in a global variable, like C<$Depth>,
1062 initially set to 0.  Each time C<dump_tree> is about to recurse, it must
1063 C<++$Depth>, and when it's back, it must C<--$Depth>.
1064
1065 Or, if the reader is familiar with closures, consider this approach:
1066
1067   sub dump_tree {
1068     # A wrapper around calls to a recursive closure:
1069     my $start_node = $_[0];
1070     my $depth = 0;
1071      # to be shared across calls to $recursor.
1072     my $recursor;
1073     $recursor = sub {
1074       my $n = $_[0];
1075       print "  " x $depth,
1076         ...stuff...
1077       ++$depth;
1078       foreach my $s (@{$n->{'successors'}}) {
1079         $recursor->($s);
1080       }
1081       --$depth;
1082     }
1083     $recursor->($start_node); # start recursing
1084     undef $recursor;
1085   }
1086
1087 The reader with an advanced understanding of Perl's reference-count-based
1088 garbage collection is invited to consider why it is currently necessary
1089 to undef $recursor (or otherwise change its value) after all recursion
1090 is done.
1091
1092 The reader whose mind is perverse in other ways is invited to consider
1093 how (or when!) passing a depth parameter around is unnecessary because
1094 of information that Perl's C<caller(N)> function reports!
1095
1096 [end footnote]
1097
1098 =back
1099
1100 =head2 Growing the Tree
1101
1102 Our C<dump_tree> routine works fine for the sample tree we've got, so
1103 now we should get the program working on making its own trees, starting
1104 from a given board.
1105
1106 In C<Games::Alak> (the CPAN-released version of Alak that uses
1107 essentially the same code that we're currently discussing the
1108 tree-related parts of), there is a routine called C<figure_successors>
1109 that, given one childless node, will figure out all its possible
1110 successors.  That is, it looks at the current board, looks at every piece
1111 belonging to the player whose turn it is, and considers the effect of
1112 moving each piece every possible way -- notably, it figures out the
1113 immediate payoff, and if that move would end the game, it notes that by
1114 setting an "endgame" entry in that node's hash.  (That way, we know that
1115 that's a node that I<can't> have successors.)
1116
1117 In the code for C<Games::Alak>, C<figure_successors> does all these things,
1118 in a rather straightforward way.  I won't walk you through the details
1119 of the C<figure_successors> code I've written, since the code has
1120 nothing much to do with trees, and is all just implementation of the Alak
1121 rules for what can move where, with what result.  Espicially interested
1122 readers can puzzle over that part of code in the source listing in the
1123 archive from CPAN, but others can just assume that it works as described
1124 above.
1125
1126 But consider that C<figure_successors>, regardless of its inner
1127 workings, does not grow the I<tree>; it only makes one set of successors
1128 for one node at a time.  It has to be up to a different routine to call
1129 C<figure_successors>, and to keep applying it as needed, in order to
1130 make a nice big tree that our game-playing program can base its
1131 decisions on.
1132
1133 Now, we could do this by just starting from one node, applying
1134 C<figure_successors> to it, then applying C<figure_successors> on all
1135 the resulting children, and so on:
1136
1137   sub grow {  # Just a first attempt at this!
1138     my $n = $_[0];
1139     figure_successors($n);
1140      unless
1141       @{$n->{'successors'}}
1142         # already has successors.
1143       or $n->{'endgame'}
1144         # can't have successors.
1145     }
1146     foreach my $s (@{$n->{'successors'}}) {
1147       grow($s); # recurse
1148     }
1149   }
1150
1151 If you have a game tree for tic-tac-toe, and you grow it without
1152 limitation (as above), you will soon enough have a fully "solved" tree,
1153 where every node that I<can> have successors I<does>, and all the leaves
1154 of the tree are I<all> the possible endgames (where, in each case, the
1155 board is filled).  But a game of Alak is different from tic-tac-toe,
1156 because it can, in theory, go on forever.  For example, the following
1157 sequence of moves is quite possible:
1158
1159   xxxx___oooo
1160   xxx_x__oooo
1161   xxx_x_o_ooo
1162   xxxx__o_ooo (x moved back)
1163   xxxx___oooo (o moved back)
1164   ...repeat forever...
1165
1166 So if you tried using our above attempt at a C<grow> routine, Perl would
1167 happily start trying to construct an infinitely deep tree, containing
1168 an infinite number of nodes, consuming an infinite amount of memory, and
1169 requiring an infinite amount of time.  As the old saying goes: "You
1170 can't have everything -- where would you put it?"  So we have to place
1171 limits on how much we'll grow the tree.
1172
1173 There's more than one way to do this:
1174
1175 1. We could grow the tree until we hit some limit on the number of
1176 nodes we'll allow in the tree.
1177
1178 2. We could grow the tree until we hit some limit on the amount of time
1179 we're willing to spend.
1180
1181 3. Or we could grow the tree until it is fully fleshed out to a certain
1182 depth.
1183
1184 Since we already know to track depth (as we did in writing C<dump_tree>),
1185 we'll do it that way, the third way.  The implementation for that third
1186 approach is also pretty straightforward:
1187
1188   $Max_depth = 3;
1189   sub grow {
1190     my $n = $_[0];
1191     my $depth = $_[1] || 0;
1192     figure_successors($n)
1193      unless
1194       $depth >= $Max_depth
1195       or @{$n->{'successors'}}
1196       or $n->{'endgame'}
1197     }
1198     foreach my $s (@{$n->{'successors'}}) {
1199       grow($s, $depth + 1);
1200     }
1201     # If we're at $Max_depth, then figure_successors
1202     #  didn't get called, so there's no successors
1203     #  to recurse under -- that's what stops recursion.
1204   }
1205
1206 If we start from a single node (whether it's a node for the starting board
1207 "xxxx___oooo", or for whatever board the computer is faced with), set
1208 C<$Max_depth> to 4, and apply C<grow> to it, it will grow the tree to
1209 include several hundred nodes.
1210
1211 =over
1212
1213 Footnote:
1214 If at each move there are four pieces that can move, and they can each
1215 move right or left, the "branching factor" of the tree is eight, giving
1216 a tree with 1 (depth 0) + 8 (depth 1) + 8 ** 2 + 8 ** 3 + 8 ** 4  =
1217 4681 nodes in it.  But, in practice, not all pieces can move in both
1218 directions (none of the x pieces in "xxxx___oooo" can move left, for
1219 example), and there may be fewer than four pieces, if some were lost.
1220 For example, there are 801 nodes in a tree of depth four starting
1221 from "xxxx___oooo", suggesting an average branching factor of about
1222 five (801 ** (1/4) is about 5.3), not eight.
1223
1224 =back
1225
1226 What we need to derive from that tree is the information about what
1227 are the best moves for X.  The simplest way to consider the payoff of
1228 different successors is to just average them -- but what we average
1229 isn't always their immediate payoffs (because that'd leave us using
1230 only one generation of information), but the average payoff of I<their>
1231 successors, if any.  We can formalize this as:
1232
1233   To figure a node's average payoff:
1234     If the node has successors:
1235       Figure each successor's average payoff.
1236       My average payoff is the average of theirs.
1237     Otherwise:
1238       My average payoff is my immediate payoff.
1239
1240 Since this involves recursing into the successors I<before> doing
1241 anything with the current node, this will traverse the tree
1242 I<in post-order>.
1243
1244 We could work that up as a routine of its own, and apply that to the
1245 tree after we've applied C<grow> to it.  But since we'd never
1246 grow the tree without also figuring the average benefit, we might as well
1247 make that figuring part of the C<grow> routine itself:
1248
1249   $Max_depth = 3;
1250   sub grow {
1251     my $n = $_[0];
1252     my $depth = $_[1] || 0;
1253     figure_successors($n);
1254      unless
1255       $depth >= $Max_depth
1256       or @{$n->{'successors'}}
1257       or $n->{'endgame'}
1258     }
1259
1260     if(@{$n->{'successors'}}) {
1261       my $a_payoff_sum = 0;
1262       foreach my $s (@{$n->{'successors'}}) {
1263         grow($s, $depth + 1);  # RECURSE
1264         $a_payoff_sum += $s->{'average_payoff'};
1265       }
1266       $n->{'average_payoff'}
1267        = $a_payoff_sum / @{$n->{'successors'}};
1268     } else {
1269       $n->{'average_payoff'}
1270        = $n->{'last_move_payoff'};
1271     }
1272   }
1273
1274 So, by time C<grow> has applied to a node (wherever in the tree it is),
1275 it will have figured successors if possible (which, in turn, sets
1276 C<last_move_payoff> for each node it creates), and will have set
1277 C<average_benefit>.
1278
1279 Beyond this, all that's needed is to start the board out with a root
1280 note of "xxxx___oooo", and have the computer (X) take turns with the
1281 user (O) until someone wins.  Whenever it's O's turn, C<Games::Alak>
1282 presents a prompt to the user, letting him know the state of the current
1283 board, and asking what move he selects.  When it's X's turn, the
1284 computer grows the game tree as necessary (using just the C<grow>
1285 routine from above), then selects the move with the highest average
1286 payoff (or one of the highest, in case of a tie).
1287
1288 In either case, "selecting" a move means just setting that move's node
1289 as the new root of the program's game tree.  Its sibling nodes and their
1290 descendants (the boards that I<didn't> get selected) and its parent node
1291 will be erased from memory, since they will no longer be in use (as Perl
1292 can tell by the fact that nothing holds references to them anymore).
1293
1294 The interface code in C<Games::Alak> (the code that prompts the user for
1295 his move) actually supports quite a few options besides just moving --
1296 including dumping the game tree to a specified depth (using a slightly
1297 fancier version of C<dump_tree>, above), resetting the game, changing
1298 C<$Max_depth> in the middle of the game, and quitting the game.  Like
1299 C<figure_successors>, it's a bit too long to print here, but interested
1300 users are welcome to peruse (and freely modify) the code, as well as to
1301 enjoy just playing the game.
1302
1303 Now, in practice, there's more to game trees than this: for games with a
1304 larger branching factor than Alak has (which is most!), game trees of
1305 depth four or larger would contain too many nodes to be manageable, most
1306 of those nodes being strategically quite uninteresting for either
1307 player; dealing with game trees specifically is therefore a matter of
1308 recognizing uninteresting contingencies and not bothering to grow the
1309 tree under them.
1310
1311 =over
1312
1313 Footnote:
1314 For example, to choose a straightforward case: if O has a choice between
1315 moves that put him in immediate danger of X winning and moves that
1316 don't, then O won't ever choose the dangerous moves (and if he does, the
1317 computer will know enough to end the game), so there's no point in
1318 growing the tree any further beneath those nodes.
1319
1320 =back
1321
1322 But this sample implementation should illustrate the basics of
1323 how to build and manipulate a simple tree structure in memory.
1324 And once you've understood the basics of tree storage here, you should
1325 be ready to better understand the complexities and peculiarities of
1326 other systems for creating, accessing, and changing trees, including
1327 Tree::DAG_Node, HTML::Element, XML::DOM, or related formalisms
1328 like XPath and XSL.
1329
1330 B<[end body of article]>
1331
1332 =head2 [Author Credit]
1333
1334 Sean M. Burke (C<sburke@cpan.org>) is a tree-dwelling hominid.
1335
1336 =head2 References
1337
1338 Dewdney, A[lexander] K[eewatin].  1984.  I<Planiverse: Computer Contact
1339 with a Two-Dimensional World.>  Poseidon Press, New York.
1340
1341 Knuth, Donald Ervin.  1997.  I<Art of Computer Programming, Volume 1,
1342 Third Edition: Fundamental Algorithms>.  Addison-Wesley,  Reading, MA.
1343
1344 Wirth, Niklaus.  1976.  I<Algorithms + Data Structures = Programs>
1345 Prentice-Hall, Englewood Cliffs, NJ.
1346
1347 Worth, Stan and Allman Sheldon.  Circa 1967.  I<George of the Jungle>
1348 theme.  [music by Jay Ward.]
1349
1350 Wirth's classic, currently and lamentably out of print, has a good
1351 section on trees.  I find it clearer than Knuth's (if not quite as
1352 encyclopedic), probably because Wirth's example code is in a
1353 block-structured high-level language (basically Pascal), instead
1354 of in assembler (MIX).  I believe the book was re-issued in the
1355 1980s under the titles I<Algorithms and Data Structures> and, in a
1356 German edition, I<Algorithmen und Datenstrukturen>.  Cheap copies
1357 of these editions should be available through used book services
1358 such as C<abebooks.com>.
1359
1360 Worth's classic, however, is available on the
1361 soundtrack to the 1997 I<George of the Jungle> movie, as
1362 performed by The Presidents of the United States of America.
1363
1364 =head1 BACK
1365
1366 Return to the L<HTML::Tree|HTML::Tree> docs.
1367
1368 =cut
1369