0:50

This time we're going to talk about that same object, but we're going to look at

it as a word. So it's a sequence of M label sets that

have N objects in total. There are also M to the N words, and so

it's a sequence of sets. Now these are really two different ways

of looking at the same object. Because if you take the string and you

take every occurrence of the letter I as being the Ith set in the sequence.

So for example one in this string appears in position seven and that's the only

place it appears so that corresponds to the set seven Two appears in one, three,

and eight. So that corresponds to the set, one,

three, and eight. there's no three.

four appears in position two and four and five in positions five, six, and nine.

You can see a direct correspondence between strings and sequences of sets.

We refer the sequences of sets as words. so that's the correspondence again just

what I just described. so what's the difference between strings

and words? there's no difference, it's only the

point of view. last time when we talked about strings,

we were concerned about sequences of characters in relation to one another in

the sequence, like how many sequences of zeros in a row are there or is does a

pattern occur. for this lecture in words, we're more

concerned with the sets of indices and not so much in the the sequence, and

there's applications where that point of view it the one that, that we want to

push. so another way to look at the same thing

is the, the balls and urn model, which is classical in, in commonotoric.

so if we have n labeled balls and we throw them into m urns you can think of

when ball i, goes into urn j, that means j character in the string is the i

character in this string is j. Hm, so when five goes into earn one, and

we just write a one in the string. so this balls in, and, and we don't,

balls and earn sequences is simply corresponds to strings.

and we don't care about the the order in which the balls are in the urns, they're

sets of indices. Though balls and urns sequences are

equivalent to both strings of words and so, sometimes, we use that model, for

describing these objects. Its more like words in terms of its a

sequence of sets of objects. so how many words are there?

or if you throw in Bolst and Merns, how many configurations are there?

Well, you can tell by the language that I'm using that this is going to be very

simply described with the symbolic method.

so we'll call the combinatorial class a w sub n since we have a sequence on m urns.

so and then the size function is just the total number objects.

and so the generating function is straight forward from the definition.

and so we've got all those examples that I just talked about.

But the key thing is that the combinatorial construction says that a

word on M letters is a sequence of M urns the sequence of sets.

and so the generating function, the, the set of objects is e to the z, sequence of

size M is that to the nth power. So the generating function is e to the m

z. Coefficient of z to the n in that in

times n factorial is m to the n. So that's the basic construction and

counting, for words. and again using the standard paradigm, we

can build of this same construction. to study variations, and there's many and

important applications of variations. but again, just to summarize, a string

is, we think of as unlabeled objects that are just a sequence.

we use OGF and our enumeration is, it's a sequence of M different unlabeled

objects. So the generating functions, 1 over 1

minus e and the end, that's what we talked about last time.

And we use that to study algorithms like string searching algorithms where as a

word is labeled objects, it's exponential-generating functions.

It's sets of objects, but you can also write it as a string the generating, the

construction is a sequence of sets of objects, size M.

and we get the same count, and we use that to study hashing algorithms, and

more generally, occupancy problems in combinatorics.

but we can do variations. again, just as for permutations, to get

results for problems and applications problems using the, the same symbolic

method that we use for the basic problem. And these are two very famous variations,

and again, if you want to see many more details about these, go to part one of

the course. So a birthday sequence is a word where no

letter appears twice. we throw the balls in the urns and we

never get two balls in an urn. a coupon collector sequence is a word

where every letter appears at least once. so birthday sequence is about if you have

a room with n people in it. And you want, and there's 365 different

birthdays. what's the chance that no two people have

the same birthday? coupon collector sequence is how many

things do you have to see before you get everything in the collection?

And there's many, many applications of both of these.

And they're classic in commonotorics. So let's look at a simple modification of

that basic construction to study the problem of how many birthday sequences

are there. so that's our words where no set has more

than one element and, and again, it's easy to define the class and, and the

generating functions, and the construction is simply a birthday

sequence with alpha to M letters, is a sequence, of length M, of either empty

urns or urns with one ball in them. So that means the EGF of birthday

sequences, is just one plus Z to the N. And a counting sequence is, coefficient

of N factorial. And N factorial times coefficient is Z to

the N in that, which is m times m plus, minus 1 and so forth.

And then if you want to get the probability that a random words of

birthday sequence, you divide by that m to the n and so forth.

So that's again, just a simple example of modifying the basic construction to be

able to quickly get a generating function for such sequences.

Or, a, the other, another example is coupon collectors, at the other end of

the spectrum. So, that's a string that uses all the

letters in the alphabet. So, it's a sequence of sets.

None of them are empty, so by now you can practically write down this construction

yourself. A birthday sequence is a sequence of

length and of set, that are all of 5 bigger than 0.

The transfer theorem immediately gives a generating function equation, so a set of

size, greater than zero, is E to the Z minus one.

Subtract off the first term the set of zero and then sequence of m of those is

either the z minus 1 to the n. So that's a generating function that

describes the the, for the coupon collector sequences, exponential

generating function. and so here's some other related

combinatorial classes. so another name for coupon collector

sequence is an m surject, surjection. That's an m word with no empty set.

if we take off the constraint on m and you just say it's a word that's an m

sur-, surjection for some m So that is, there's some value of N that

uh,[COUGH] it's a sequence of sets, none of which are empty, for some size of the

sequence. So these are examples of surjections.

so how many sur, surjections are there of length N?

So this one's all 1s. This one's all 1s and 2s.

and then the rest of them are 1s, 2s, and 3s.

but from analytic combinatorics, it's easy to write down the construction for

this. for M-surjections, we had sequence of

length M and we got that generating function.

For surjections in general we take out the constraint on m so that means a seq-

the set greater than 0 is e to z minus 1. Sequence of those is 1 over 1 minus that.

So it's 1 over 2 minus e to the z. That's the generating function for

surjections. Now to extract an estimate for the

coefficient of z to the M and N. Again it's best done with complex

asymptotics that we'll talk about in the second part of this course.

Turns out to be about N factorial over 2 log 2 of the n plus 1.

Again, a very simple modification to the basic construction but it leads us to a

generating function equation that we can later analyze.

And by the way, the analysis at the other end is going to have come from similar

general transfer theorems. so, this is the, types of variations that

we, see on words. So, we start with the basic construction,

and, so maybe we generalize it to talk about coupon collectors, where we insist

that all the letter counts are bigger than a certain amount.

or my birthday where there, all the letter counts are less than a certain

amount. and these things are relevant in the

analysis of of computer algorithms and many other applications.

or you can do the constraint really in any way that, you know, you want the

occupancies to all be odd or all be even or prime numbers, whatever you want.

leads to a generating function equation. and then we talked about surjections,

though again starting with a basic thing we get a galaxy of related constructions,

that the constructions are quite simple and they lead to generating functions of

all different character. and that's a introduction to the analytic

commonatorics of words and strings, and how to use the symbolic method to get

exponential generating functions for a variety of commonatorial classes.