How Bad Is Being Greedy?. An assessment of a greedy approach to… | by Jarom Hulet | Aug, 2023


Exploring a greedy solution to the stock cutting problem

Contents

  1. Motivation of the stock cutting problem
  2. Quick overview of NP-Hard problems
  3. Encoding the stock cutting problem into Python
  4. The greedy algorithm
  5. Comparison to exhaustive search in low n-space
  6. Comparison to random search in higher n-space
  7. Conclusion

Motivation of the stock cutting problem

I’m a data scientist by trade. While my data science skills are very important at work (by definition), I find that data science concepts help solve a lot of problems outside of work too!

One of the ways that my DS skills come in handy is in my hobby as a DIYer/maker. A common challenge is knowing how to plan for cutting material. I have a list of cuts I need to make from multiple pieces of same-sized material from the store. How does one plan the cuts in a way to waste as little as possible? This challenge is known as the ‘Stock Cutting Problem.’ As it turns out, this can be a really hard problem to solve, NP-Hard in fact!

In this article, I’ll explore a ‘short-cut’ (pun intended) to solving this problem and analyze how this method compares to the long way.

White oak top, steel base table I made last year — image by author

Quick overview of NP-Hard problems

I’m not going to go into very much depth about NP-Hard problems here, but I do want to give some intuition on it. What makes an optimization problem hard is typically the size of its solution space. Meaning how many possible solutions do you need to explore to find the best one? The difficulty of a problem is usually assessed by how fast the solution space grows as the problem size grows.

For the ‘Stock Cutting Problem,’ the problem gets bigger, much bigger as you add more cuts. See how fast the solution space grows below!

Image by author

The solution space grows factorially, meaning that the total number of solutions that must be searched to be sure of the optimal answer is n!, which as you see, gets big really, really fast.

NP stands for ‘non-deterministic polynomial,’ which means that the problem grows faster than any polynomial function. There are tons of resources that dive deeper into the NP/NP-hard problems. This is all I will discuss here.

Encoding the stock cutting problem into Python

The stock cutting problem essentially boils down to an ordering problem. You make the cuts in a specific order, and when you run out of length on a piece of stock, you start cutting (still in order) on the next piece of stock.

A visualization explains this best. Let’s say we have this order of cuts : [4′, 2′, 1′, 2′, 2′, 1′], and we have stock sizes of 5′. The waste calculation looks like this:

Image by author

Here we have a total waste of 4′.

But, if we change the order (keeping all of the same cuts), we can get different levels of waste. Let’s try [4′, 1′, 2′, 2′, 1′, 2′]:

Image by author

Here, we only get 3′ of waste — simply changing the order reduced our waste! This is the whole idea of this optimization problem. We want to find which order of the cuts is best.

Now, to encode this into a Python script, we need (1) functions to calculate the waste of each order, and (2) an algorithm to sort lists into optimal orders.

The function to calculate waste just replicates the logic outlined above. Here is the Python code:

def total_waste(stock_size, solution):

'''
Calculates cutoff waste give a specific solution

inputs
stock_size (float) : the dimension of the stock
available for purchase
solution (list) : list of floats depicting the
order to make the cuts

output
cut_off_waste (float) : the total cutoff waste of
the given solution

'''

# set up variable to keep track of the total lengths of
# cuts on current piece of stock
temp_cut_length = 0

# start with no waste
cut_off_waste = 0

for cut in solution:

# if next cut doesn't fit on current stock,
# calculate new waste and reset for another piece of stock
if temp_cut_length + cut > stock_size:

# calculate cutoff waste
cut_off_waste += stock_size - temp_cut_length

# reset cut length
temp_cut_length = cut

else:
# add to cumulative length of cuts on stock
temp_cut_length += cut

# add in last cut waste -- it is not captured in the loop
cut_off_waste += stock_size - temp_cut_length

return cut_off_waste

We’ll cover the algorithm we need in the next section.

The greedy algorithm

The greedy algorithm is extremely simple, just find the biggest piece that fits on what is left of the current stock you are working on.

Using the previous example, we’ll say we want to make these cuts [4′, 2′, 1′, 2′, 2′, 1′] and we need a greedy algorithm to optimize the order.

We first start with the longest cut that fits on our current piece of stock. Since we haven’t made any cuts, our current stock piece is 5′. 4′ is the longest cut we have and it fits on a 5′ of remaining stock. The next longest cut is 2′, since we only have 1′ of stock left, it is too long. We go to the next longest cut, which is 1′. This fits in the remaining stock, so our next cut is 1′. The algorithm follows this pattern until there are no more cuts left.

Here is what that algorithm looks like in Python:

def greedy_search(cuts, stock_size):

'''
Calculates a greedy optimal solution

inputs:
cuts (list) : cuts that need to be made
stock_size (float) : size of stock available for purchase

outputs:
cut_plan (list) : sequence of cuts to obtain greedy
optimal results
waste (float) : amount of material wasted by solution

'''

# empty cut off plan, to be populated
cut_plan = []
# start with cutoff size equal to stock size
cut_off = stock_size
# copy cuts list to avoid modifying original list
cuts_copy = copy(cuts)

# sort cuts in desc order
cuts = list(np.sort(cuts)[::-1])

# continue ordering cuts until
# all cuts have been ordered
while len(cuts_copy) > 0:

for cut_size in cuts_copy:

# if cut size is smaller than remaining stock,
# assign the cut now
if cut_size < cut_off:

# add cut to plan
cut_plan.append(cut_size)

# update the leftover amount
cut_off -= cut_size

# remove cut from list of cuts still needing
# to be ordered
cuts_copy.remove(cut_size)

# reset cut_off to be the full stock size
cut_off = stock_size

# calculate waste using total_waste function
waste = total_waste(stock_size, cut_plan)

return cut_plan, waste

Comparison to exhaustive search in low n-space

We are saving ourselves a ton of time by using an approximation of the optimal solution via the greedy algorithm, but how good is that approximation? An exhaustive search of the solution space gives the global optimal solution — which is our gold standard. Let’s compare the greedy algorithm’s solution to the global optimal solutions for scenarios with just a few cuts (remember that to find the global optimal with a lot of cuts is really hard).

I randomly created 250 cut lists for cut sizes ranging from 2–10 cuts. Each cut was between 0 and 1 and the stock size was set to 1.1. Below is the performance:

Image by author

As you can see, as ’n’ increases, the greedy performance gets worse relative to global optimum, but stays relatively close and highly correlated.

Comparison to random search in higher n-space

Unfortunately, we have a major blind spot… That is, we don’t know the global optimum in higher n-space. Now, we get into the business of comparing different types of heuristics (algorithms designed to approximate global optimums). How does the greedy algorithm hold up against a random search of the solution space?

Image by author

We can see that, as the number of cuts gets larger, the random search gets much worse. This makes sense because the random search is randomly selecting 500 solutions and picking the best one — as the solution space explodes, the probability percentage of the solution space we randomly search gets smaller and smaller. We now can see that the greedy solution is much better than just randomly looking at potential solutions.

Conclusions

It seems that the greedy solution to the stock cutting problem is a reasonable, very fast way to find a pretty good solution. For my purposes, this is sufficient. I only do a few projects a year and they are typically small. However, for applications such as manufacturing plants, more complicated and intensive approaches will likely have a significant dollar impact to the company. You can look into ‘mixed integer linear programming’ (MILP) which is another way of searching for better optimal solutions.

If I were to go more into depth on this problem, I would compare the greedy approach to better meta heuristic algorithms (random search is probably the worst one ) such as various versions of hill-climb, tabu search or simulated annealing. For now I’ll leave it here however, I have another table to make!

For all of the code used in this article, see this repo.



Source link

Leave a Comment