0:22

In the last lecture, we went through and tried to count

the number of operations in the various pieces of your program

and derive some expression that gave us, some intuition about the

running time of your program based on the size of its inputs.

0:35

In this lecture, we're going to take a different approach.

We're going to think about instrumenting your code.

To kind of, count the number of operations.

And then we're going to plot the size of the input, versus the running time.

0:47

Plotting this data will give us a, give us a

nice way to kind of analyze the behavior of that data.

Turns out that humans are visual creatures, so looking

at a plot of a data is actually much

more effective than looking at, say, some raw dump

of the data or some kind of raw table.

I'll, I'll give you an example.

1:04

Here's a, actually a well known image.

It's Scott's level of agitation as a function of the

number of global variables that you use in your code.

No, that's not really true.

It's actually a plot of kind of global temperature over the last 125 years.

This particular picture has inspired billions of words of discussion.

It's a good example of kind of the power of using visualization to understand data.

1:30

So what we're going to do, is we're

going to start off, we'll take some mystery code.

We'll instrument it, we'll generate some data, then we'll figure

out how to plot it using a package in code sculptor.

And then we'll do some quick analysis of that data

to try to understand the running time of this mystery function.

1:52

So if you look on the screen here, I have three functions that I've defined.

Mystery 1, mystery 2, mystery 3.

I've used code folding to hide the body of these functions.

When you load them in, you can either look at the

body or you can fold it up and hide it yourself.

2:07

But the problem that we're going to try to

attack here is to understand the running time, kind

of the length of the execution trace for each

of these mystery functions as a function of input_val.

So as I change input_val, each of these functions takes longer to execute.

We kind of like to know, kind of what's

the relationship between input value and the execution time.

So to do that, what I've done is I've built a

global counter, I'm sorry Scott, and I've essentially inserted in the

execution of each mystery function a statement that increments the value

of this counter each time we do whatever we want to analyze.

In this case, we have a collection of loops.

So every time we execute the body of

the innermost loop, we increment the counter by one.

2:48

So, then I built this plotting function down here that I can

pass in a plot size, kind of the range for the input values.

The function we're going to plot, mystery 1, mystery 2 and mystery

3, and then the plot type, we'll talk about in a second.

So then, here's what happens.

It's very simple, we're just start out with the plot that's an

empty list, and we choose var, various values for the imput size.

We run the plotting function, and then we grab the value again, put val, grab

the imput value in the counter, and append it to, at the end of the plot.

So, the result of this is a list

of points, in a kind of two dimensional space.

Their input value versus the counter size.

3:22

So, what we like to do now is actually

figure out how to make sense of that data set.

So, we're going to talk about plotting [SOUND].

Okay, let's talk about how to plot data.

3:35

So the basic idea is, we created this list of points.

And we'd now like to make a picture of it.

So, in Python, the standard method of doing this is, choose a plotting package.

3:45

Now, we're at Python.org here.

There's literally, if you look as this, there's

dozens of various packages available for doing plotting.

3:54

This is a little bit like what we faced when we used Simple

GUI versus TK Enter or Pie Game or WX Python would be my guess.

There's lots of different packages for doing GUIs.

There's lots of different packages for doing plotting.

If your working in idle and you want to stay

basically, slow down the desktop, pick one of these packages.

Read over it.

Figure out how to use it.

It shouldn't be too hard.

And you can plot your data.

4:17

We've provided a custom plotting package inside Code Sculpture.

It's extremely simple.

It's called simple plot.

It basically allows you to make line plots.

It's there for your convenience.

I'm going to use it in the example here because it's

easy to do and works very nicely inside code sculpture.

You're not required to use this though.

But the critical thing to understand here is the philosophy of Python.

4:38

When you want to pick up and learn how to do things

you're gona have to understand how packages written by other people work.

That's a part of the survival skills inside Python.

So, when your here this things says well, why don't you treat me

the right way to plot or the right way to build a GUI.

In Python, there are many different ways to do this and so the ability to move back

and forth between packages is a key skill

that you're going to have to hone in this class.

All right, so let's talk a little bit about how SimplePlot works.

You can find it in the docs here, under

the documentation, you click on Graphics Modules, scroll down.

There's really only two operations you can do in SimplePlot.

You can make a line plot or you can make a bar plot.

The syntax is here.

At this point, you should be able to basically

read the docs and figure out how to do it.

I'll go back and show you an example inside my code her in just a second.

[SOUND] Okay, let's do some plotting.

So we'll return back to our example code.

From here you see my three mystery functions.

Here's my plotting function that actually builds the

plot, and down here what I've down is I've

called that function build plot three times with

a plot size of 40 for each of those.

I pass in the names of the three mystery

functions, and I'm going to do a standard plot for now.

And so here's a critical layout called simple plot, the plot lines.

And one of the things that we really gotta pay attention to

here is we can give it multiple sets of data to plot.

So we give it a list with the three plots here.

So this is a list.

Each of these entries is a list of points.

So, when I click Run.

6:17

This is the plot of the second function.

This is the plot of the third function.

So just by looking at this, we can

kind of deduce some simple properties of these functions.

On the first functions looks like it's linear.

6:28

So what that really means is, that if we

look at the input value, the number of steps

we're taking in this, the, the value, the count,

is going to be a linear function that input that you.

So particular, if we double the size of the input value.

The number of, the value of the counter of

the number of steps in there would actually double.

The second and third functions is a little more unclear.

It looks like they might be polynomial functions.

Maybe it's quadratic, and this one's cubic.

6:55

It's important to know whether a function is polynomial versus say exponential.

For example if a, if a function is quadratic and we double

it, the, the running time just goes up by a factor of four.

If it's cubic the running time goes up by a factor of eight.

But if the function is exponential then we double the size

of the input, we square the running time and just, goes crazy.

So in fact exponential functions we want everybody to

run them for a large input in most cases.

So what we're going to do next is let's talk

a little math and I'll show you kind of a

trick to answer, answer the question about whether these

functions are actually polynomial or not, we're going to use logarithms.

[SOUND] Okay.

During the course of this class you work with two functions occasionally.

Lets go over those functions very quickly here.

You may do word take logarithms and you may do exponential.

So an exponential you've probably see in high school math.

You take some number and you raise it to a particular power.

7:52

We're going to actually use the

natural exponential function here fairly frequently.

Sometime [UNKNOWN] to the xp of n.

And all it really is is Euler's constant, which is 2.718 arrays to the nth power.

Euler's constant is a magic number.

You can look at the Wikipedia page, if you want to understand where it comes from.

But, for, now just think of it as a constant between two and three.

8:13

The second function we're going to work with is logarithm.

We're going to take use the natural logarithm again, fairly frequently.

So the natural log of a number n, we just simply take it to base c.

It's the, it's the power you have to raise e to get n.

8:28

These functions are actually inverses of each other.

Which means if we take the exponential and then

we take the log, we get back the original input.

Or we take the log and then take

the exponential, we get back the original input.

Notice that in some cases you want to work with a different base.

You want to do base 2 or base 10.

The typical notation you will see here is

log with the subscript that corresponds to the base.

Probably the most important thing

to understand about exponential and logarithms

are they kind of obey the following two rules down here.

9:14

With logarithms, it's kind of the complementary thing.

If you have the logarithm of the product of two numbers,

it's simply the sum of the logs of each of the numbers.

Okay, this is a very useful piece of information here.

So next, I'm going to show you how to go

through and actually use this to do log, log plotting.

[SOUND] So, we've taken our mystery functions, we've built a plot of

them, and now we're trying to understand the behavior of the, kind of the,

running time, the value of this counter as a function of this input value.

So kind of a metric for whether you've got a reasonably behaving function

is to ask, is the running time a polynomial of the input value.

So we kind of like to ask, okay, x is the input

value, it's the running time y, is some polynomial function of it.

10:06

So it turns out we can use logarithms

to help kind of understand and answer that question.

I mean here is the trick.

If y is equal to a times x to the n, we take the log of

both sides, we get log of y is equal to log of a x to the n.

Remember the product rule for logarithms says we can break this apart.

All right, this is log of a plus n times log of x.

10:29

So if our data points actually satisfy this relation, y equals a

times x, y equals a times x to the n, then the

data points should have the property that the log of the y

value should be a linear function, of a log of the x value.

10:47

So in fact, this is a strategy we can use when we're actually

plotting our data to figure out if our data lies on a polynomial curve.

We'll take the log of the x, and we'll plot it versus the log of the y.

And if that's a straight line, then guess what?

The data lies on a polynomial function.

And in fact, it's even better.

If we look at the slope of that line, the slope of that line is exactly n.

It's the degree of the polynomial.

11:11

So in fact, this is a trick behind something that

you'll see in lots of plotting packages called log, log plotting.

It's hidden a little bit because what they'll do is they'll take the

labels on the axis, instead of

spacing them evenly, they'll space them geometrically.

You might see this where you'll see the label be 1, 10,

100, 1000, 10 thousand, 100 thousand, a million and they're all equally spaced.

What they're doing is they're basically doing this log trick here.

So what they're doing is they're plotting the data

point by taking the log of each of the values.

That has the advantage of if the data grows really

quickly, you can still make some sense out of this

because instead of getting a wildly growing curve, you get

a curve that's perhaps, perhaps maybe growing as a straight line.

So lets go back now and look at the three functions and do a log, log plot of them.

[SOUND] Okay lets finish off our example.

So, here we are back in our Python code.

12:08

The cool thing is we can actually change that.

And what you'll see up here is instead of plotting the input value versus the

counter, we'll plot the log of the input value versus the log of the counter.

It'll be the log, log plot.

So I can change this to LOGLOG and I'm going to hit Run.

[SOUND] And what comes out here is a plot and what you can see

here is that the LOGLOG plot of this line is again a line.

Probably something more interesting here is our second curve.

Our second curve, the one that we thought

might be quadratic, well, we do the log, log

plot of it, what we see is that,

yeah it's pretty much growing like the straight line.

And in fact if you look at it, it looks like the slope is two.

So in fact, I think this is probably growing quadratically.

This last one though, look, it's not going as a straight line.

I thought it might be cubic, but it's not.

13:06

so, probably this, this last one is, may not be a cubic.

It may not be, may not even be polynomial.

So let's go back real quick and look at our

functions and just kind of see what the mystery functions look like.

[SOUND] So the first one was really simple, it was

just going through and doing kind of just five iterations

of inner loop for every value and every value and

input value range, so that's, that's 5x, it's linear easily.

This one here, if you look at it, it was doing some plotting but at the

end of the day it's kind of two nested loops and this is, this is quadratic.

13:41

This last one actually is, is actually not quadratic.

In fact, you kind of see this magic 1.1 to some power.

It's actually doing a very slowly growing exponential function.

So if we had, had turned out that our mystery function was actually growing

in this rate, you probably wouldn't be able to run it for really large values.

So we could kind of see that when we did

the log, log plot, that we didn't get a straight line.

14:05

So, hopefully this lecture's helped you to kind of gather more knowledge

about exponentials and logarithms and given you some insight into how we could

go about analyzing the running time of our code even if we

can't do some kind of close analysis based on looking at the code.

[SOUND]