Branch & Bound problem in Ruby

on May 5, 2013

Problem definition

I want to find the set of widgets out of a list of widgets, that best fits a set of requirements. This is a problem of my own design that model’s a larger problem I am having for a project.

Specifics:

Widget has numeric qualities that correspond to the requirement categories
There is a single paramount value for each widget (and a corresponding requirement), which cannot be exceeded.
- In the example code it is referred to as size
- This limits the ultimate size of the set
Widgets may exist more than once in the set or not at all
The optimal solution will fill all requirements, in the least amount of size, and “over-fill” the requirements the least amount possible.

For example:

Requirement Size: No more than 90 Requirement A: Need 250 Requirement B: Need 200 Requirement C: Need 100

Widgets (where first number is size and the following are the values that fill requirement A & B respectively): Widget X: 15 | 90/ 20/ 80 Widget Y: 10 | 70/ 50/ 10 Widget Z: 20 | 100/ 150/ 50

{X, Y, Z} is the best solution according to my brute enumeration of all possibilities(83) and testing of their score. My brute Ruby code arrives at this bench mark

                user     system      total        real
brute[90]:    0.010000   0.000000   0.010000 (  0.002249)

However, if we expand the search area, by changing the requirement “size” to 400 we get the following benchmark:

                  user     system      total        real
brute[400]:    0.280000   0.000000   0.280000 (  0.283165)

The solution {X, Y, Z} is the same, but it has taken 100x as long to arrive at because the possible sets is now 4262. I hope to reduce this additional processing time that comes with set expansion by using the branch and bound algorithm.

Algorithm

I believe that this problem is known as integer linear programming. I will attempt to solve it with a Branch and Bound algorithm.

I see two features of the above problem that differentiate it from the vanilla implementations I have seen:

The usage of more than one widget in a set
The size requirement for set bounds (most problems seem to be positional or based on the number of elements)

Combined, these two issues make the solutions a bit more flexible and perhaps a bit more difficult to evaluate how fit a particular branch may be.

Let’s define what we need and what we have for the algorithm:

Root node: I have seen examples where this is a) an empty set b) a set filled with an arbitrary example. I’ll pick the former.
Branching: We will branch on widget combinations.
Scoring: In order to evaluate the fitness of a particular set or branch. You need a way to evaluate score. The best score for my problem would be 0. Which would mean all requirements are satisfied, and none are “over satisfied”. For my problem. I score according to this hierarchy 0 > negative values > positive values. Positive values indicate requirements are not satisfied, negatives have at least one over satisfied requirement.
Bounding: To reduce the amount of tested sets, we can bound on the following conditions:
- If size requirement is filled, the set is full and no more branching
- If all requirements are filled but the score is worse than the best possible solution, the branch is dead (adding more widgets will make the score worse).

Implementation

Start with an empty set.
Get branches for all available widgets (widgets that fit into the array).
Evaluate the score of all branches.
Arbitrarily set the “best” solution to the first array with first widget.
Prune branches.
Select the branch with the current best score
Expand that branch, by expanding with all available widgets.
Continue until you reach a terminal point (see bounding).
Return one level and repeat 5-8.
Repeat until all branching has been exhausted.

Results

My first pass at the branch & bound function is found here. I see several ways to optimize the solution and make the branch selection and pruning more efficient. Also, I could avoid so much score recalculation and just figure out what the single widget addition will do to each candidate branch. However, I was mostly trying to focus on a clear and easy implementation.

In the small size example the results were slightly slower. The reason being that the implementation does all permutations rather than combinations:

             user     system      total        real
b&b[90]:  0.000000   0.000000   0.000000 (  0.003899)

However, when we increase the sample size, we see a significant performance improvement over the brute above:

              user     system      total        real
b&b[400]:  0.010000   0.000000   0.010000 (  0.005912)