0:04

Today we're going to talk about trees.

This is the first second half of the course is

the first of a series of lectures on applications.

Just a little bit in the way of orientation before we begin for the first

half of the class, the first thing we did was introduce analysis of algorithms.

But then we have spent the rest of the time

surveying the basic methods that we need in terms of mathematics

in order to do scientific studies about the performance of algorithms.

And then we finished off last time by introducing analytic combinatorics.

So this is all about mathematics and about the techniques.

0:43

Now, we're going to switch to the second half of the class.

We're going to switch to applications.

And we're going to talk about some basic classical commenatorial

classes with lots of applications.

And we're going to look at

1:07

And this will bring us to a really basic combinatorial class with all kinds

of applications.

Trees, permutations, strings and triads, words and mappings.

And we'll be using labeled and unlabeled classes.

They're alternating with ordinary generating functions and

exponential generating functions.

Really applying all the mathematical techniques

that we learned in the first half of the class.

So today, were going to begin with trees.

And we'll start off by talking about basic definitions of trees and forests.

To begin I'll review what we've talked about in terms of binary trees,

2:16

The external nodes are signified with little boxes and

the internal nodes are signified with circles.

And we talked about binary tree, we've talked about those before.

So now we refer to the level or the depth of a node in the tree.

Starting with the root at depth zero,

the children of the root at depth one, and so on.

3:01

And we did a derivation with the symbolic method that I will just rush through

because we have done it so many times, but just to get back on the notation.

So we have a generating function, which is the sum over all objects

is s to the size of that object, which collects together objects by their size.

We have atoms that are internal nodes and external nodes.

And then depending on how we set the size of those atoms,

that's what we're going to count.

In this case, we're counting internal nodes.

And then we have a combinatorial construction that comes as a mathematical

representation of our explicit definition of what we mean by a binary tree.

And then, our basic transfer theorem translates that

construction immediately into an equation on the generating function.

Which then we can solve and get asymptotic estimates of the coefficients.

So that's just a quick review of the symbolic method for binary trees just to

set the context for different problems that we're going to do in this lecture.

4:07

So now in general, the classical mathematical concept of

a tree is In terms of forests.

So a forest is a set of trees and a tree is a forest with a root added to the top.

And there's actually, just from this quick observation,

the generating function that enumerates forest is just 1 over Z or

the generating function, that enumerates trees,

is just Z times the generating function, that enumerates forest.

By corresponding to adding the root.

So, this is the recursive definition, but these are recursive structures.

A forest is a sequence of destroyed trees.

A tree is a node, called the root.

And it should say, forest is empty or secret to destroying trees.

Trees of node, called the root, connect into the roots of trees, in the forest.

And so, now a route can have any number of children, including zero.

In the leaf nodes of the nodes with zero children.

5:07

And then the level is the same as before,

we start with the root at depth 0,

the children of the root at depth 1, and so forth.

And the height is, again, the deepest node in the tree.

5:40

So these are all the forests with one, two, three, and four nodes.

And immediately, right away, you can recognize

6:06

another node in either order.

Or the root node connected to two children or the straight

line which is a root connecting to a child connecting to a child.

Notice the order of significance, a forest is a sequence of trees and

that's consistent with computer applications where we have to represent

the thing in the computer some way, we pick a sequence usually.

6:32

So anyway, the catalan number are here and of course

a tree is root connected to a forest, so if you attach roots to all of those,

you get all possible trees and it's the same numbers with a shift over by one.

6:55

So, and

we just follow through the basic steps that we've outlined many times before.

And we're going to be doing this a lot.

But a small mistake leads to the completely wrong answer.

So it's good to be careful.

So f is the class of all forests.

And the size is going to be the number of nodes, where our atoms are now,

there is just one kind of node.

And it's got generating function Z.

And g is the class of all trees on the same atom, and again,

the same size function.

So those are two combinatorial classes and now we can use our definitions

of those classes to develop constructions directly from the definitions.

What do we say?

We say a forest is a sequence of trees and a tree is a root

node in a forest and that immediately corresponds to those two constructions.

And then the transfer theorems immediately give us the ogf's for

those two commonotorial classes.

F(z) = 1 / 1-G(z) and G(z)=zF(z) that's what I referred to on the previous slide.

So now if we just solve that we get F(z) substitute in one minus Z F of Z for

G of Z and the solve for F, you get F of Z minus Z, F of Z squared equals one.

That's exactly identical to the catalan generating function.

So it means that the number for

a forest with N node is exactly equal to the number of trees with N nodes,

which is the catalan numbers, which we know the asymptotics [INAUDIBLE] For.

And the number of trees within nodes,

the number of forests with N minus 1 nodes, so it's gotta factor for less.

8:41

So that's a symbolic method that shows that the number of forests and

the number of trees is exactly the same.

Now from an analytic point of view, we're happy to get the result.

But usually,

in common when you find that you have two classes that enumerate exactly the same.

What you want to find is a bijection showing that

a correspondence between every member of one class and every member of the other.

And in this case that bijection is important because

it gives us a way to represent forests and trees in the computer conveniently.

9:32

The way the correspondence goes is that, for

every node, we connect it to its left child and its right sibling.

So the node at the left of the forest corresponds to the top node

in the binary tree, and so forth.

That's a one to one correspondence between forests and binary trees.

Another way to look at it that's even maybe easier to see, called a rotation

correspondence If you take this binary tree [COUGH] and connect it up as if it

were the nodes of the forest, you can see that you have the forest kind of rotated.

10:09

And that's a very useful and

easy way to represent forests and general trees within a computer.

Because a binary tree is easy to represent in chunks of memory where

every node has whatever information's associated and its two children.

Whereas in a forest you have to deal with a variable number of children per node

which can be inconvenient or difficult in some computing environments.

10:38

Just as an aside, at this point it's worth thinking

about is the idea of an algorithm for drawing binary tress.

And I want to show that because I'm going to be showing a lot of binary trees.

And the exercise of writing a program for

drawing binary tress is a worthwhile exercise for everyone to do.

11:01

So the most natural approach that we use very often,

particularly when talking about sorting and searching algorithms,

is to, well first of all, the y coordinate is easy.

That's just, well it's the depth, but since they go down,

it's the height minus the depth.

So we start with the root at the highest and then just subtract one.

So we know the y-coordinate of every node.

The x-coordinate of the node, the easiest way to

set that up is to just do a recursive inorder traversal of the tree.

And just assign x coordinates every time you reach a note.

And that's corresponding to say the recursive

programming says what's a coordinate of a root?

It's the number of nodes on the left plus 1 and

then the x coordinate of everything on the right side is bigger than that.

Then you can see immediately that number one is easy to assign the coordinates,

just do an in order a traversal of the tree and

number two it's basis the nodes on the tree out nicely.

Usually when we draw big trees, we leave big binary trees,

we leave out the external nodes.

Now, there's a problem with this, is that you get the distracting long edges for

some kinds of trees.

Particularly, for say binary trees will represent general trees,

and You can use a similar algorithm like this for general trees, by the way.

But anyway, that's a problem.

So what we do sometimes is, take the x coordinate and

at every level, we just evenly space the nodes.

If there's four nodes at that level, we evenly space them and

center them on the root node.

12:55

That's a useful way to draw trees, because it gives a profile of

what the trees, how thick the trees are, how many nodes there are at each level and

that's an interesting thing to know about in some kinds of analysis

13:18

And actually,

when you see random binary trees, they have many many different shapes.

And if we were to do random trees or random forests as well, you see quite

a collection of Different and interesting shapes.

And so the challenge that we have, that we're going to go into, and

the reason that I wanted to draw these big, binary trees,

is to make clear what that challenge is.

So we have to analytically characterize this in some way.

I may be averaging over all of these or whatever it is that we're doing in terms

of the analysis, it's going to have to explain this behavior.

And of remarkably we were able to get very far in doing that and so that's what we'll

start doing now, that just a brief introduction about trees and forests.