By disguising himself as the Bull Demon King, Wukong obtained the iron fan from Princess Iron Fan. He headed back to Flaming Mountain and started planning how to blow out the fire. But there were now 100 spot fires breaking out, so Wukong had to use a magic spell to summon clones to help him put them out. Each spot fire required a different amount of clone effort and time to extinguish. The fires tended to climb uphill when attempts were made to blow them out, so the spot fires in lower positions had to be put out before those in higher positions. Using astrological calculations, Zhuge Liang noticed that each fire was weaker at a certain time of day, and that was the best time for Wukong to start blowing it out. Taking all factors into consideration, Wukong wanted to come up with a plan that started blowing out each fire as close to its ideal time as possible. Zhuge Liang summoned the dragon of dimensions for assistance from the professors. All right. Sun Wukong has finally got the iron fan from Princess Iron Fan, and now he can put out the spot fires on Flaming Mountain. The spot fires appear along these lines running up towards the center, where the fire is most intense. The way he's going to do this is by making clones of himself, and because of his weakness after all the fights and adventures he's gone through, he's only able to make 43 clones of himself at the moment. Each of the spot fires has different requirements: it will need a different number of clones to tackle it and a different number of minutes to put it out. And there's some simple physics involved, because these fires appear along ridges running up the mountain: of course, we need to put out the bottom fire before we put out the fire above it.
Because if we put out the upper fire first, the bottom fire could relight it, since the flame travels uphill. So we put out the bottom fire, and once we've done that, we can put out the fire above it. So there are basically precedence constraints amongst the fires we need to put out. All right, Zhuge Liang has looked at the stars and also understands that the fires vary in intensity over the day. For each fire there's an ideal period where it drops in intensity, making it easier to put out, and that's when we have to attack it; for different fires it will be different times. So basically, we want to be putting out each fire as close to its desired time as possible. That leaves us with our firefighting problem. We've got 43 Wukongs and 100 spot fires, each requiring a different number of Wukongs and a different duration to put it out. At any time, of course, we can't exceed the number of available Wukongs: we can move them around, but at any one moment we can only have 43 working on the various fires. We have the precedence constraints that a lower-altitude fire has to be put out before any higher ones can be put out. And we have the objective: each spot fire is weakest at a certain time of the day, so that is the best time to fight it, and we're trying to minimize the sum of the deviations from these best times. So what we're trying to do is get as close to that best time as possible for each of the fires. All right, so let's look at the model. The first part of the model is just reading in the data. Here's our number of Wukongs and the number of spot fires. Then for each fire we've got its duration, that is, how long it takes to put it out, the number of Wukongs it requires, and the best time to put it out. And then we have precedences between the fires: a number of precedence pairs, given by the pre and post arrays.
We've seen this many times before: basically, we need to put out the pre fire before the post fire. And of course there's a maximum amount of time we could ever need: if we just put out each fire one at a time, that gives the maximum time we would ever need. Then our decisions. The key thing is, for each fire, when do we start putting it out. From that we get the end time for the fire: if we start putting out the fire at some time and add the duration of how long it takes to put it out, that's the end time for that task. Now our constraints. The precedence constraints are fairly straightforward: the end time of the pre task has to be less than or equal to the start time of the post task, which just means we finish the pre task before the post task begins. Most of the handling of the Wukongs is done by a single cumulative constraint, which says: given these tasks' start times, these durations, and this number of required Wukongs for each task, we can have at most w Wukongs being used at any time. And now our objective. This is a little different from the objectives we've seen for other kinds of scheduling problems in the course: for every fire, we calculate the absolute value of the difference between when we actually started fighting it and the best time to start, and our objective is to minimize the sum of those differences. All right. So we're going to work on a small data set, with only eight Wukongs available and 10 spot fires with this data, and a few precedences, six of them, shown here. Remember, we are talking about local search.
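The model just described might be sketched in MiniZinc as follows. The identifier names (w, d, r, b, pre, post, s, e) are assumptions for illustration, not necessarily those of the actual course model:

```minizinc
% Sketch of the firefighting model; names are illustrative.
include "cumulative.mzn";

int: w;                            % available Wukong clones
int: n;                            % number of spot fires
set of int: FIRE = 1..n;
array[FIRE] of int: d;             % duration to extinguish each fire
array[FIRE] of int: r;             % clones required by each fire
array[FIRE] of int: b;             % best (weakest) time to start fighting it
int: p;                            % number of precedence pairs
set of int: PREC = 1..p;
array[PREC] of FIRE: pre;          % pre[i] must finish before post[i] starts
array[PREC] of FIRE: post;

int: maxt = sum(d);                % putting fires out one at a time suffices
array[FIRE] of var 0..maxt: s;     % decision: when to start fighting each fire
array[FIRE] of var 0..maxt: e;     % end time: start plus duration
constraint forall(f in FIRE)(e[f] = s[f] + d[f]);

% lower fires must be put out before higher ones
constraint forall(i in PREC)(e[pre[i]] <= s[post[i]]);
% never use more than w clones at any one time
constraint cumulative(s, d, r, w);

% total deviation from each fire's best fighting time
solve minimize sum(f in FIRE)(abs(s[f] - b[f]));
```

Note how the objective sums absolute deviations from the best times, rather than minimizing a makespan as in most scheduling models.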
Now, a key step in steepest-descent local search is that we are in a current state, and we need to find the best neighbour of the current state that satisfies any constraints required to hold at every state we visit. So you can see that this step is itself a constrained discrete optimization problem: I'm really looking at all my neighbours and finding the best one that satisfies the constraints. We've been looking throughout this course at methods to solve constrained discrete optimization problems; can we use those to solve this problem? The answer is yes, and large neighbourhoods are the way we do it. Up to now, we've been talking about small neighbourhoods: we only look at very few neighbours, and the way we explore them is by brute-force search, simply looking at all our neighbours and picking the best one. That's fine, but we can also deal with large neighbourhoods, and then we can use the other methods we've already seen, constraint programming or mixed integer programming, to explore them. This is what gives rise to the idea of Large Neighbourhood Search: because we don't want to explore by brute force anymore, we use these techniques, which are exactly designed to solve these kinds of problems. So how do we specify a large neighbourhood? One very frequently used example: given the current state d, where we are now, we look at the neighbours of d by taking all the variables in the problem, fixing k percent of them to their current values, and leaving the rest unfixed. The possible values of the unfixed variables define the neighbourhood, and solving for the best neighbour is like solving a reduced version of our original problem.
Really, we're solving exactly the same original problem, except that we've fixed some k percent of the variables, so we've just made the problem smaller. There are many other approaches. You can define a neighbourhood as the states with at most k changes in variable values; that's also used in some kinds of large neighbourhood search and in other hybrid searches. Or we can fix all but a particular set of related variables: we can look at our particular problem and find problem-specific neighbourhoods. But the idea in large neighbourhood search is that we don't limit ourselves to a small number of neighbours; we build a large neighbourhood and then use some of the discrete optimization solving techniques we've seen throughout the course to solve it for us. One advantage of large neighbourhood search is that, because we have a large neighbourhood, we can see further and move further, and find a better place. That makes it harder to get stuck in a local minimum: if I only look at the states immediately next to me, I might appear to be at a local minimum, but when I look further afield, I can see a better place to go. The other advantage is that we can handle arbitrary constraints. As long as the underlying solver can handle them (we're going to use a constraint programming solver or a mixed integer programming solver), we can handle them too, because we're just mapping the neighbourhood exploration into a problem for those solvers. But we have some concerns. What happens if the neighbourhood is too large? We might spend a lot of time, maybe getting stuck, just looking in one neighbourhood. What happens if the neighbourhood is too small?
If we end up defining a very small neighbourhood, we're not going to get much advantage from these complete-search approaches. What about if we're in a local minimum: even when we use a large neighbourhood, are there ways to escape it? And, of course, all of these methods need an initial feasible solution to start the process, because we want to fix variables to their current value d, which is typically the best solution found so far. All right. So let's look at a basic large neighbourhood search minimization algorithm. We get an initial valuation by some method, and then we go around a loop until we've run out of resources. It's a local search method, so we're not going to prove optimality; we just keep going until we say, "Well, that's enough resources." And what do we do? We take our current solution, define some neighbourhood around it, and then explore that neighbourhood to get the best solution we can find in it. If that's better than our current solution, the current solution becomes this new point. And we're basically going to have to put some limits on the neighbourhood exploration in general, in case it's too large; so we use limits both in the explore-neighbourhood step and in the stopping condition. All right. Looking in a bit more detail: define-neighbourhood defines an area N to explore around the current solution point d, and it can be different in different iterations, even for the same d. Remember that when we're doing large neighbourhood search, or any kind of local search, often we will not find a better position, so we will often stay at the same point for a long time. So we need the neighbourhood to change: we don't want to get the same neighbourhood we've already searched. We want to be able to find a different neighbourhood every time we apply this, even with the same input solution point.
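The loop just described can be sketched in code. This is a minimal illustration, not the course's implementation: the neighbourhood is a random set of unfixed variables, and the complete solver that explores the neighbourhood is stubbed out by a brute-force routine on a toy problem (names such as `lns_minimize` and `solve_reduced` are made up for the sketch).

```python
import random

def lns_minimize(init, objective, define_neighborhood, explore, max_iters=200):
    """Basic LNS minimization: repeatedly unfix part of the current best
    solution, re-solve the reduced problem, and keep strict improvements."""
    best = init
    for _ in range(max_iters):               # resource limit, not an optimality proof
        unfixed = define_neighborhood(best)  # which variables to free this round
        cand = explore(best, unfixed)        # best solution with the rest pinned
        if cand is not None and objective(cand) < objective(best):
            best = cand                      # accept only strict improvements
    return best

# Toy instance: minimize sum |x[i] - target[i]| with x[i] in 0..9.
TARGET = [3, 7, 1, 8, 2, 5]

def obj(x):
    return sum(abs(a - t) for a, t in zip(x, TARGET))

def random_neighborhood(x, keep=0.6):
    """Fix ~60% of the variables to their current values; return unfixed indices."""
    return [i for i in range(len(x)) if random.random() >= keep]

def solve_reduced(x, unfixed):
    """Stand-in for a CP/MIP solver: best values for the unfixed variables."""
    y = list(x)
    for i in unfixed:
        y[i] = min(range(10), key=lambda v: abs(v - TARGET[i]))
    return y

random.seed(1)
best = lns_minimize([0] * len(TARGET), obj, random_neighborhood, solve_reduced)
```

The loop structure is the interesting part; in real LNS the brute-force stub is replaced by a call to a CP or MIP solver over the unfixed variables.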
And as I said before, the neighbourhood is often specified by unfixing a set of variables. Then we have the exploration of the neighbourhood, which is basically searching for an alternative solution e inside it. By default, we look for the best solution in the neighbourhood, which makes the exploration itself a discrete optimization problem. We can vary how that search is done, and in LNS we usually perform it using either a CP solver or a MIP solver, one of our complete solvers capable of solving these kinds of problems. All right. So, as I said before, the first step in a large neighbourhood search is to get an initial feasible solution. The obvious way to do this is to ignore the objective for the moment and solve a satisfaction version of the problem. We can do this fairly easily for a scheduling problem with a greedy method: for each start time, we find the earliest start time that's possible and set it to that minimum value. That gives us a resulting schedule. It might be very poor, but we don't really care that much about the initial solution; we're going to rely on large neighbourhood search to improve it into a much better one. So here's an initial feasible solution for our small version of the firefighting problem. You can see the start times, and these are the ideal times for each of the fires; the purple fire, for instance, is being fought far away from its ideal time. And if we look at the objective here, you can see it's quite high. So here are the 10 different fires, and here's the objective of the current initial feasible solution. What are we going to do? We're going to fix 60 percent of the start times to their current values, chosen randomly. If you look at the ones we've fixed, shown here in bold, six out of the 10 are going to be left where they are, and we're going to move the others around.
So you can see we've added a pin to the ones which are fixed, and there are really only four tasks that we can move around. So we're really going to solve a scheduling problem over those four tasks: exactly the same problem as before, except that we're fixing six of the 10 tasks so they can't be moved. We've made the problem much, much smaller. If we do that, we find a better schedule, with an objective of 40. Obviously, we're going to keep that and move on. Then we fix another 60 percent randomly, a different 60 percent now. You can see the ones which have been greyed out have been freed up, while the ones still in solid colours are pinned down; they stay where they are. We solve that problem, and we now get an objective of 30. It's a better schedule, so we're going to keep it. Next, we fix another 60 percent randomly; you can see which ones have been fixed. We solve that and get a schedule which is no better, so we reject it: we're not going to move, because we didn't get any better; we'll stay where we are. So we return to the last schedule, which is still our best schedule at 30. Since the two were equal, we could also have chosen to move; that is a possibility in large neighbourhood search. Now we fix another 60 percent randomly and solve again, and we get an even better schedule, with objective 28. In fact, we are now at a local minimum: whichever set of six tasks we fix, freeing up the other four, we'll find nothing better. And that's because this is actually the global minimum. We've found the best solution, so obviously we're not going to find a better solution in any neighbourhood. All right. So, just like any other local search method, LNS can't prove optimality. We arrive at this local optimum, and we don't even know it's a local minimum with respect to all neighbourhoods.
We only know it's a local minimum with respect to the neighbourhoods we actually checked it on. That's why, in practice, we would just keep running, trying different neighbourhoods, trying to find a better solution, and in the end give up and return the best answer, which in the previous case would actually have been the optimal answer. All right. So how can we define neighbourhoods? We don't have to use just the random approach; we can be problem-specific, making use of our knowledge about the problem to build a neighbourhood that works well for it. We've got to think about which variables we want to unfix, which decisions we want to be able to redo in a different way. We can take a set of variables that are all related to one part of the problem; that's an obvious way of building a neighbourhood which is meaningful, since it means we basically get to solve that part of the problem again. Or we could take a set of variables that we can see are giving us a really bad value with respect to the objective; obviously, if we can get a better solution out of those, we should be able to improve the objective value more. If you think about the monkey formation problem, we can define neighbourhoods by unfixing a sequence of positions. We might unfix positions 1 to 3, positions 2 to 5, or 4 to 6, because we know those positions are related to each other. And in the monkey formation problem, we could pick a neighbourhood by looking at where we get the worst contributions to the objective. Remember, in the monkey formation problem the objective is basically the sum of how well each pair of adjacent monkeys fights together. And here we can see there are two spots where we're not getting much value out of our solution: here we've got a value of zero, and here we've got a value of two.
So B, C, E and F are examples of good positions to unfix, so we can try moving them around to get a better answer. All right, another example of problem-specific neighbourhoods is vehicle routing. In a vehicle routing problem, we have pickup tasks, each with a location and a time window, and we have to assign those tasks to trucks. Then the kinds of neighbourhood possibilities are: unassign all the tasks for a single truck; that basically means, obviously, that they will all be assigned back to that truck, because all the other trucks are full, but it allows this truck to optimize its own schedule. Or we could unassign all the tasks in a time window, which means that all the trucks carrying things in that window can swap their jobs and maybe get a better division of the tasks between trucks. Or we could unassign all the tasks in some geographic region, which means that any truck entering that region to do a task has the ability to swap with another truck, to get a better result in the end. And indeed, an important use of problem-specific neighbourhoods is adaptive selection of neighbourhoods. We might have all three of these kinds of neighbourhoods in our vehicle routing LNS solution, and then find that some of them are more often finding us better solutions; so we'll tend to focus on the ones that typically find better solutions and spend less time on the ones which don't. This is adaptive selection of neighbourhoods: choosing by looking at how often each neighbourhood is successful in finding us a better solution. Of course, generic random neighbourhoods are also surprisingly good.
Alright, with random neighbourhoods we just say: for every variable, fix it to its current value with probability k, and then solve for the unfixed variables. This is surprisingly good if we pick the percentage k correctly. If k is too large, so we fix almost all the variables, there is no freedom left to search and no improvement can be found, because we're looking in too small a space; we'll just get stuck in local minima. If k is too small, we end up with too large a neighbourhood, and the search takes too long. And obviously, if we set k to 0 percent, we are just solving the original problem. So we need to find a balance for k: it shouldn't be too large, and it shouldn't be too small. We can actually use adaptive random neighbourhoods. We start with some fix rate of k percent, say 90 percent, and we have a maximum number of search steps S allowed for any neighbourhood, which we need anyway, because we know that sometimes a neighbourhood just won't get us anywhere, the search gets stuck, and we want to stop it and try a different neighbourhood. Then we do the following: generate a random neighbourhood using the current percentage k, and solve the LNS subproblem for at most S steps (we needed that bound anyway, so that's nothing new). Now, if we solved it completely, if we know we've looked at everything in the neighbourhood in N steps, where N is much less than S, then that's telling us our neighbourhood is too small; we could afford to spend more time looking at a bigger neighbourhood. So we make k smaller, which fixes fewer variables and makes the neighbourhood larger.
If we don't solve it completely within S steps, then the neighbourhood is too big to solve in S steps, so we should make k larger, that is, fix more variables and make the neighbourhood smaller. If we do this, these adaptive neighbourhoods are robust, because they adapt the value of k to the problem we're looking at, and they are often quite hard to beat. They should be the starting point for any kind of large neighbourhood search, because they're very easy to implement. So we have variations. As we've presented it, large neighbourhood search is a steepest-descent local search: when we look in the neighbourhood, we're looking for the best solution. If a neighbourhood is too large, that may not be worthwhile. So an obvious variation is greedy LNS, which looks in the large neighbourhood and stops as soon as it finds a first solution which is better. It will either find a first improving solution, or it will run out of resources having found nothing better. The point is that in a large neighbourhood, we can often find a first improving solution much, much quicker than we can prove that we've found the best solution in that neighbourhood. So we commit to the first improving solution and continue the LNS loop; that's often a very powerful approach as well. Both approaches have their advantages and disadvantages. Now, large neighbourhood search in MiniZinc: some solvers for MiniZinc support large neighbourhood search, and you can do this through Gecode by attaching the relax_and_reconstruct annotation to the solve item. This takes an array of variables and the fix percentage k as an integer. What it says is that something happens every time we restart; so we need our solver to be restarting as well, where restarting just means keeping track of how many resources we want to use for every subsolve.
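In MiniZinc syntax, such a solve item might look like the following sketch; the variable array s, the Luby scale of 250, and the 70 percent figure are illustrative, and relax_and_reconstruct is a Gecode-specific annotation:

```minizinc
solve :: restart_luby(250)             % restart repeatedly (Luby sequence)
      :: relax_and_reconstruct(s, 70)  % on each restart, fix 70% of s to
                                       % their values in the last solution
      minimize obj;
```

Here obj stands for whatever objective expression the model minimizes.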
On each restart, we fix that percentage of the variables in the array to their current values; the remaining ones are unfixed, and then we search. Any variable which isn't mentioned in the array, by the way, is also unfixed. So basically, we'll often put the key decision variables in the array, and all the other variables, the auxiliary variables, will be unfixed every time, so they get a new chance; often we don't want them to be fixed, because fixing them would limit what we can do with the other variables. Of course, this will use a search strategy if one is given: we can have two annotations on the solve item, one for relax_and_reconstruct and one for a search strategy. And since we are using restarting, we almost certainly want that to be an adaptive or randomized search strategy. A fixed search strategy, which takes the same variables in the same order every time and tries the same values, is not really going to gain from restarting; actually, we will still gain something, because we fix different sets of variables each time, so we will certainly do different things, but typically it's more beneficial to use some kind of adaptive or randomized search strategy when we are using large neighbourhood search. Alright, to see the power of LNS in MiniZinc, we can look at the firefighting problem, the full version with 100 spot fires and 43 Wukongs. If we run it with a 15-minute timeout using the default search, we just get a best solution with value about 79,000. If we use restarting search (and we have already seen that restarts can be very powerful), we find a better solution, about 69,000. But if we do LNS, using the same Luby restart sequence and starting with 70 percent of the variables fixed, we find a solution of about 61,000, which is much better; and indeed, we find much better solutions than the default search very early on.
We don't have to wait 15 minutes to get better solutions: after one minute we have about 67,000, after two minutes 64,000, and after three minutes 62,000. You can see that in the last 12 minutes we're not gaining that much, but it shows how large neighbourhood search drives us very quickly to good solutions; that's the power of large neighbourhood search. So, in summary, large neighbourhood search is a very powerful local search method, because it's based on complete search methods, and it can handle arbitrary constraints, because the underlying solvers already handle arbitrary constraints. Although remember, we still need to get an initial solution; if it's very hard to find one, we might want to convert some of the constraints into penalties, in the way we've seen in other local search methods. And it scales complete approaches to much bigger problems: because the complete solver never solves the whole problem at once, it just fixes most of the variables and works on a small version of the problem, and so it can scale to much bigger problems. Random neighbourhoods are very robust, so if you are ever doing LNS, start off by using random neighbourhoods, because you can use them without any hard work; but they can certainly be bettered by specialized neighbourhoods for specific applications. If you look at work in, for example, vehicle routing, there are lots of interesting specialized neighbourhoods used in that domain.
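Finally, the adaptive fix-rate rule described earlier (make k smaller when a neighbourhood is fully explored well within the step budget, larger when the budget runs out) can be sketched as a small helper. The function name, the half-budget threshold, and the step size of 5 percentage points are illustrative choices, not part of the method.

```python
def adapt_fix_rate(k, steps_used, step_limit, solved, delta=5):
    """Adjust the percentage k of fixed variables after one neighbourhood:
    solved completely with plenty of budget left -> neighbourhood too small,
    so fix fewer variables; budget exhausted without finishing -> too large,
    so fix more variables."""
    if solved and steps_used < step_limit // 2:
        return max(0, k - delta)    # bigger neighbourhood next time
    if not solved:
        return min(95, k + delta)   # smaller neighbourhood next time
    return k
```

Starting from, say, k = 90 and calling this after every subsolve lets the fix rate settle wherever the problem needs it.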