>> That was our first step. >> Now, we get a lot of these volumes.
>> So, we basically go into our three-dimensional data, that we got from
the tomograms. >> We found, so we have the
three-dimension data. >> We go and look for all the areas
with the virus, and we segment out those spikes, or envelopes.
>> Sometimes that's, that done by hand. >> Basically, somebody goes and places
a box around each one of them. >> Sometimes you can do that
automatically if you have a model of what, more or less, you expect.
>> Then you travel around your image with that model.
>> And you do, a distance at every place.
>> And if the distance is small, you say, oh.
>> It looks like there is 1 of these envelopes here.
>> It's very similar to what we did, we did with non local means.
>> In non local means, we were looking for similar patches.
>> Now, the patch is a model that you have and with it, you go wrong.
>> So, either one of those ways, either manual or with this semi-automatic, you
get this volumes. >> This volumes can be like a
100x100x100 pixels representing, basically, the envelope inside this spike
that we have. >> Now remember, these are noisy,
they're not all the same. >> We need to group them, we need to
align them, and that's what we're going to basically do.
>> We are going to align them first, and there still very noisy.
>> So the alignment is not going to be great because you're trying to align
things that are very, very noisy. >> but it's a good step.
>> You can actually, only if you want, align those that are.
>> Basically, it put a very, very high threshold and very high level of
confidence in the alignment. >> And you say I'm going to align them.
>> Only those that basically after I align, the distance is almost zero.
>> They're almost identical. >> So, then I have more confidence that
I have than. >> Good alignment.
>> Once again alignment is you start and just rotate them until the distance
that we show in the previous slides becomes the smallest possible.
>> Once you have them aligned, then classification becomes a bit easier.
>> So you say okay now I have them all aligned.
>> I have them all in a common space. >> Now I want to group them to try to
only put in the same group those envelopes that basically have the same 3
dimensional shape. >> And there are many ways of doing
this classification or clustering that we need there.
>> But let me just illustrate to you one example.
>> So you have all of them aligned. >> And now you have the distance, the
distance that basically led you to the smallest raw, the smallest alignment
rotation. >> So you have, you align them, you
rotate them, you align them, you have the distance.
>> And now, you basically, how do you group them, you.
>> Group all of them. >> So you pick one, and you say, hey,
bring me all my friends. >> Who are my friends? Don't have very
small distance to me. >> That's 1 group.
>> Then you pick another one. >> And say, bring me all my friends.
>> And who are the friends? Those that I go a very small distance.
>> So you basically group them. >> Once you group.
>> If you did everything right, you can basically average them.
>> And we know that the average would reduce the noise.
>> That's what we taught before. >> We explained when we were doing
image denoising. >> So you can kind of denoise them a
tiny bit. >> And after you denoise them a tiny
bit, you can say, hey, let's try to align them again.
>> Now, there are a bit more clean images.
>> So we are going to align them again. >> We are going to cluster again.
>> We are going to align them again. >> And we are going to cluster them
again. >> And you repeat that for a number of
iterations until you're satisfied. >> Either that things are not changing
too much. >> Or basically you say, this is the
best I can do, so I'm going to stick with that, and that's a process.
>> With the final distance, with alignment, clustering, alignment,
clustering, and the big thing here, once again, and we're going to come again and
again to that, if we have a lot of this >> Small boxes.
>> Now, I want to reiterate to you, all this is done in three dimensions.
>> These 100x100x100 cubes, we have like 4,000 of them.
>> Sometimes we have 10, 20,000 of them.
>> So, a lot of computations. >> We need to align everybody with
everybody. >> We need to compare everybody to
everybody. >> We need to do that many, many times.
>> You can speed up these using different, basically, tricks.
>> One is for this particular case, going to Fourier domain.
>> Normally these things are implemented in GPU's and, and very, very
efficient implementations to make them run in hours and not in weeks.
>> So that's what we have that more challenging, the most challenging part,
computationally is this alignment part because we have to basically try all
these angles, and that's normally done in the, in the Fourier domain.
>> For these types of examples, as we say, because of the missing wedge and
because of the, the speed. >> Now, you have an algorithm.
>> You have, you've developed this image processing algorithm, and you're
going to use it to detect the three-dimensional shape of the envelope
in the HIV virus. >> But then you say to yourself, how do
I know if my algorithm works? Nobody has ever seen the envelope at the resolution
that I want to see it. >> So, how can I validate? And this is
very important when you're talking about medical.
>> Imaging. >> Because you don't want to make
mistake and mislead, let's say, a complete vaccine development because you
made a computational mistake. >> So, what you do in most image
processing, but in particular when you're talking with medical imaging, is you go
slow. >> First, you start from what are
called phantoms. >> You create three dimension shapes
that look like viruses from your prior knowledge.
>> You add noise. >> You project them.
>> You make missing wedges. >> So you make them like they were
acquiring the microscope. >> And you run your algorithm for that.
>> You know the grand truth, because you created it.
>> You hope. >> To reconstruct.
>> And to you don't finish that, you don't move up.
>> So, you solve this. >> You say, great.
>> I put different structures. >> Phantoms I created.
>> I rotate, add noise, add missing wedge.
>> I got them back. >> I'm happy.
>> The next step, because it's almost impossible to do a perfect simulation.
>> Of everything that is going in the microscope, is you take particles that,
for some reason, we know their structure. >> Maybe because they are larger than
the HIVs, maybe because they are more rigid.
>> And they could be acquired with other technologies.
>> And one of them is what's called the GroEL, the GroEL.
>> So, you go into your microscope, you put GroEL, you run your algorithm.
>> You get the 3D shape that everybody knows is the right shape.
>> You say, okay, this is as much validation as I can do.
>> Now, I'm ready to do HIV. >> Even here, you can do a lot of
validation. >> For example.
>> You can take these 4000 samples and drop 500 of them.
>> Do your computations. >> Then drop a different 500 at random.
>> Do your computations. >> And you expect to get very similar
results. >> So, still here you have to do a lot
of validation. >> But this is a regular progress:
phantoms, no data, and then you go into the unknown.
>> Which is the HIV. >> Remember, we are looking for these
tiny spikes here, and which are, we're talking about basically, arms strokes