0:00

At the beginning of the course we mentioned that a developing ford

algorithm provides the foundation for modern day internet routing protocols

this video is going to supply a few of the details.

So for starters, when you look at the code of the Bellman-Ford algorithm I hope

on an intuitive level it seems like a sort of distributed algorithm in some

sense of the word. Right?

Think about what you need to compute a given subproblem A of I comma V.

Well the vertex V just needs to know from each of the possible previous hops.

So from each of the vertices that can talk directly to V.

What their subproblem solution was on the previous round.

So in each round each vertex in some sense only communicates with immediate

neighbors. Vertices to which it has direct

connection. Now as you can imagine, there's a fairly

long list of engineering challenges that have to be tackled to move from the basic

Bellman-Ford algorithm to a routing protocol that you could conceivably use

in practice. So what I'll do here is highlight some of

the main issues and what's the high level solution to those issues and with those

fixes to the basic Bellman-Ford algorithm, we'll actually have something

that's surprisingly close to how modern day Internet protocols, protocols really

work. That said, the discussion here will be

necessarily brief, and somewhat deliberately naive.

So for those of you who want to understand this material on a really

nitty-gritty level, I encourage you to buy a networking book, take a networking

class, or read more about the topic on the web.

So, I want to talk about three modifications to the basic Bellman-Ford

algorithm. I'll discuss them in order from the most

trivial to the least trivial. The first and simplest modification to

the algorithm is motivated [SOUND] by the fact that routing in the internet is

destination-driven. Given a piece of data floating around in

the internet, you really don't care that much where it came, where it came from.

What you really care about is where it needs to go, right?

It's exactly the same thing as with snail mail, the way it gets routed around the

globe. And if I'm on vacation in, say, Croatia,

and I put a postcard in the mail, I don't even need to put a return address.

I just say the address of my friend who might be back in the states.

[SOUND] And then, just based on its destination, this postcard get routed

appropriately. First through local hops within Croatia.

Then on a plane over the Atlantic Ocean and then again locally within the US

network. Same thing on the internet, based on the

destination IP address of a packet, that's how you know which intermediate

sequence of routers it has to go through to get to its eventual destination.

All you have to do to accommodate destination driven routing is reverse all

of the directions in the Bellman-Ford algorithm.

So instead of having a source vertex, S, out of which you compute all shortest

paths, you're going to have a destination vertex, T, into which you compute

shortest paths from all possible origins. Each vertex, rather than storing a

predecessor pointer, the final hop on a shortest path from S to that vertex, it's

going to store the first hop on a shortest path to the destination, t.

Now of course if you're a router in the Internet, you don't want to be optimized

purely for a single destination T. You have to be ready to accommodate data

down for anywhere in the Internet. So as a result these vertex stores, this

shortest path distance in the first hop, not just to a single destination T, but

to every relevant destination t. So it's prepared for any data that it

might acquire. So this should sound pretty intense.

There's a lot of potential destinations of T, there's a lot of IP addresses out

there in the world. So a couple comments.

So first of all, for most computers and the internet they're not really

responsible for knowing how to get to a lot of destinations t and the Internet.

This is cause of the hierarchical structure of the Internet,

right? So if it's just some computer inside the

stanford network, it really only needs to know how to get to other computers inside

the stanford network, of which there aren't that many.

And it also has to know how to get to the stanford gateway router.

And if it's data down for somewhere outside of the Stanford network, you just

sort of defer the responsibility to the stanford gateway router.

And you let it handle it. You just get that far, and then it will

take it from there. On the other hand, for the routers

embedded in the core of the Internet, they really are responsible for knowing

how to get to all kinds of different places.

Basically the entire Internet. And their routing tables, guess what,

they're really quite big. So the number of entries might be say in

the hundreds of thousands. So as you can imagine, routing table

implementation is, is something that people who work in networking have

thought about a ton. There's interesting hardware

optimizations, software optimizations and then people also worry about how can you

just make sure routing cable size doesn't get too big.

So, for example, you want to avoid the fragmentation of IP addresses to make

sure that you don't need that many distinct entries in the routing tables.

Hundreds of thousands is plenty, thank you very much.

You will sometimes hear Bellman-Ford based shortest path protocols like this

one, called distance vector protocols. The distance vector that this term refers

to, is, at a given vertex, you have a vector indexed by possible destinations

t. And you're keeping track of the shortest

path distance in the first hop for all of those destinations.

So, the second issue we need to address is a little more serious, but it's still

not too bad. This issue is asynchrony.

And what I mean is if you look back at our basic Bellman-Ford algorithm it was

synchronous in the following sense. We were careful to structure the outer

iteration of the four loop so that all of the sub problems corresponding to value i

- 1 gets solved before any of the sub problems with index i.

Now when you're talking about the internet and you have different routers,

different computers running at different speeds.

You have different physical connections with different bandwidth.

There's no way you're going to keep people in sync.

There's noway you can implement synchronized rounds at the Internet

scale. But what's cool is that the Bellman-Ford

algorithm is surprisingly robust, to the order in which you solve its subproblems.

Really if you solve them in any kind of reasonable order, you're still going to

be computing correct shortest paths at the end of the day.

So to explain what I mean, let's change the Bellman-Ford algorithm to be

push-based rather than pull-based. So the basic Bellman-Ford algorithm is

pull based in the following sense. For each other iteration I, and each

vertex V, in this iteration, the vertex V in effect asks its neighbors, for their

most recent information. So, for there, sub-problem solution

values, from the last iteration of the outer four loop.

We're going to change it to be push based, which is rather than asking your

neighbor for the latest information, whenever you have something new to say,

whenever your sub-problem value changes you're just going to go ahead and tell

all of your neighbors. You're going to push that information

onto your neighbors whether they want it or not.

So, I'm not going to define this especially formally.

It's easy to do, let me just show you a very quick example which hopefully gives

you a gist of the idea. So, consider this green network and

suppose you're trying to compute shortest paths from everywhere to t.

Remember we've switched form source driven to destination driven routing.

So, initially t knows how to get to it'self with link zero and everyone else

only has plus infinity. So to get started the destination t is

going to inform all of its neighbors that it can get to itself with a path of

length zero. So who does it notify?

It notifies V and W because U is not directly connected to T.

U does not learn this information yet. So because the Internet is asynchronous,

even though t probably sends out messages at exactly the same time, to v and w,

telling it about its cool zero path from it to itself, who knows, which of v or w

gets that information first. So maybe, the line speed is faster to w,

and w finds out first, the t can get to itself with a path of link zero.

At that point, w says well cool, I didn't have any path at all previously, but I

can get to t with cost only four and once I'm at t I'm done.

T can get to itself with cost zero. So, w updates its shortest path estimate

to the destination t from plus infinity down to four.

So now remember we're doing this push-based implementation.

So W has new information, it has a better shortest path to T.

So it needs to tell all of its neighbors. In particular it's going to tell, the

neighbor U that it has a path of length four to T.

And now remember still floating around in the internet is this message from T to V

advertising T's empty path, from itself, of length zero.

But then who knows what happens first. Maybe in fact W's message to U arrives

before T's message to V arrives. So then U's going to say, Oh, cool.

Well, if from W to T has cost only four, I can get to W would cost only three.

So, that gives me a path of like seven all the way to t.

Now in this graph, there are no incoming arcs to U.

So U doesn't have anybody to notify. And now at some point in the future, the

message from T actually arrives at the node V.

And so at that point V says, oh, okay cool.

So I can get direct to T on an edge of cost two, from T I only have to pay zero

to get the rest of the way to T. So I'm going to update my estimate of my

distance to T from plus infinity down to two.

V now that it has a revised estimate, it's responsible for notifying all of

it's neighbors. So it tells U, it says hey U, I can get

to T on a path of length only two. And then U says well, hey great, that's

actually an improvement. My old path had length seven.

I can get to the node V with cost only one and V tells me it get, get the rest

of the way with cost only two, so that gives me a path of length three all the

way to T. So in this particular example, this

asynchronous, push-based implementation of Bellman-Ford did correctly compute

shortest paths, and that is in fact true in general in any network.

And of course when I state this fact, I'm assuming that there's no negative cycles.

In the context of Internet routing, actually, negative cycles aren't a big

deal. Usually you think of all the edges having

non-negative length in internet routing applications.

Why does the algorithm converge eventually?

Well essentially it's because every single update strictly decreases

somebody's estimate of their shortest path distance to the destination T.

And there's only a finite number of these configurations you can go through until

you finally must be at the correct shortest path distances.

So this very crude convergence argument only provides an exponential upper down

on the, on, on number of updates that you might need before you correctly compute

shortest paths. And in fact, at least in the worst case

theoretically this asynchronous version of Bellman-Ford can require an

exponential number of updates before it successfully finds all of the shortest

paths. That's a nontrivial problem you might

want to think about. So, if you have the luxury of

implementing the Bellman-Ford rhythm in a synchronous or centralized setting, you

do want to use the synchronous version that we discussed in the basic version.

If you're in an asynchronous setting you don't really have a choice other than to

implement it asynchronously. And then of course if you're hoping

you're going to in your particular network have convergence time much better

than the worst case, exponential bound. So the original Bellman-Ford Algorithm is

from the 1950s, and with the modifications we've discussed to this

point, we're up to routing protocols that were deployed as late as the 1970s.

So if you want to read more you can look at the RIP and RIP2 Internet routing

protocols. And if you really want to be nerdy about

it you can check out the request for comments related to these protocols.

RFC number 1,058. So what is an RFC?

What's a request for comments? Well this is the mechanism by which the

community around the Internet sort of vets proposed modifications proposed

protocols. So if you really want to see nitty gritty

details those that's a good place to look.