So it's a beautiful Saturday here in Cape Town.

I'm walking here on the campus grounds,

and I just want to tell you a little bit

about this new module.

So this module is on the use of Julia for statistics.

I'm not going to explain a lot about

statistics although I'll mention one or two things.

But I want to show you how easy it is

to do your statistical analysis in

Julia and how powerful

Julia is to do your statistical analysis.

So how to approach normal statistics using Julia.

I am in JuliaBox here from Julia Computing,

and we see that the kernel

that we're running here is Julia

1.0.0 as of the time of recording of this video.

Now what I always like to do,

especially during this transition

phase to the new Julia 1.0,

is just a little sanity check.

I just do 2 plus 2 and hit

Shift and Enter just to make sure that

the kernel is running and I see it's 4.

I know things are working in these uncertain times.
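
A minimal sketch of that sanity check as a notebook cell. Printing `VERSION` is an addition for illustration and is not something shown in the recording; it simply reports which Julia version the kernel is running.

```julia
# Sanity check: confirm the kernel evaluates code at all.
result = 2 + 2
println("2 + 2 = ", result)

# Optional extra (an assumption, not from the video): report the running Julia version.
println("Julia version: ", VERSION)
```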

So let's have a look at what we're going to deal with.

In this section, a quick introduction,

we're going to talk a little bit about adding

packages right here in JuliaBox specifically.

We're going to create some random variables.

When you start working with statistics,

you might not have a proper dataset to work with,

although there are so many datasets available online,

free and open datasets,

but I'm going to show you how to simulate your own data.

I think that's very important.

So we're going to look at creating

some random data point values for variables,

otherwise known as random variables.
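
Simulating data along these lines might look like the following sketch. The variable names `age`, `score`, and `group`, the sample size, the seed, and the chosen ranges are all hypothetical and only serve as an illustration; `rand` and `randn` come with Julia itself, and `Random.seed!` is in the `Random` standard library.

```julia
using Random

# Fix the seed so the simulated values are repeatable (the seed value is arbitrary).
Random.seed!(1234)

n = 100                           # hypothetical sample size

# Integer values drawn uniformly from the range 18 to 80.
age = rand(18:80, n)

# Continuous values: standard normal draws shifted to mean 50 and scaled to spread 10.
score = 50.0 .+ 10.0 .* randn(n)

# A categorical variable: each entry is picked at random from two labels.
group = rand(["A", "B"], n)

println(age[1:5])
println(score[1:5])
println(group[1:5])
```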

We are going to look at descriptive statistics.

I believe that when you see a big table of data,

obviously for us as humans,

it's impossible to understand

what the data is trying to tell us,

and every dataset has a story.

There's some knowledge hidden in there,

and you have to bring that knowledge

out through the use of statistics.

Now since we humans, at least, can't see that story

when we look at this large dataset of

rows and rows and columns and columns of values,

the first thing to do is to summarize

it in some way, and that's through descriptive statistics.

We take all those values and we represent

them by single values that mean something to us.
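
As a small sketch of what "single values" means here, the example below summarizes a made-up vector with the `Statistics` standard library, which is where `mean`, `median`, and `std` live as of Julia 1.0. The `data` values are hypothetical and chosen so the results are easy to check by hand.

```julia
using Statistics

# Hypothetical values, small enough to verify by hand.
data = [2.0, 4.0, 4.0, 5.0, 10.0]

println("mean    = ", mean(data))     # 25 / 5 = 5.0
println("median  = ", median(data))   # middle of the sorted values = 4.0
println("std     = ", std(data))      # sample standard deviation = sqrt(36 / 4) = 3.0
println("minimum = ", minimum(data))  # 2.0
println("maximum = ", maximum(data))  # 10.0
```

Five numbers stand in for the whole vector, which is the idea behind descriptive statistics on a much larger table.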

Then the simulated data that we've created,

we're going to change that into a dataframe.

That's I think a proper thing

to do because most of the data that

we do work with is inside of a database or a spreadsheet.

If it's in a database, it can at least be

extracted into a flat file,

into a spreadsheet,

like Microsoft Excel, and then we import that.

When we import it, it's imported as a dataframe.

So we're just going to take the data that we've

simulated and we're just

going to create a dataframe from that.

Once we have this dataframe,

we can look at descriptive statistics

using this dataframe,

which is going to be slightly different from

just having the computed variables as array objects.

Then we're going to visualize that data,

and for me that's always the second step.

After describing the data,

I want to visualize it because

that visualization is also a very good way for

us as human beings to understand

this hidden knowledge inside of our datasets.

Then we're going to have a brief look

at inferential statistics,

and we're going to do the common parametric tests

and some testing for categorical variables as well.

Then, finally, we're just going

to export the dataframe

that we created as a CSV file.

So it's easy enough to import it,

but I just want to show you how to export it.

So this is by no means

a whole introduction to statistics,

although we're going to look at the basics of

statistics and, more importantly, how we

use Julia to approach

this problem of doing statistics.