# Getting started

## Introduction

**MIPLearn** is an open source framework that uses machine learning (ML) to accelerate the performance of both commercial and open source mixed-integer programming solvers (e.g. Gurobi, CPLEX, XPRESS, Cbc or SCIP). In this tutorial, we will:

1. Install the Julia/JuMP version of MIPLearn
2. Model a simple optimization problem using JuMP
3. Generate training data and train the ML models
4. Use the ML models together with Gurobi to solve new instances

<div class="alert alert-warning">
Warning
    
MIPLearn is still in an early stage of development. If you run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!
    
</div>


## Installation

MIPLearn is available in two versions:

- Python version, compatible with the Pyomo modeling language,
- Julia version, compatible with the JuMP modeling language.

In this tutorial, we will demonstrate how to install and use the Julia/JuMP version of the package. The first step is to install the Julia programming language on your computer. [See the official instructions for more details](https://julialang.org/downloads/). Note that MIPLearn was developed and tested with Julia 1.6, and may not be compatible with newer versions of the language. After Julia is installed, launch its console and run the following commands to download and install the package:

In [1]:
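using Pkg
# To install the package from GitHub, uncomment the following line:
#Pkg.add(PackageSpec(url="https://github.com/ANL-CEEESA/MIPLearn.jl.git"))
# The output shown below was produced with a local development checkout instead:
Pkg.develop(PackageSpec(path="/home/axavier/Packages/MIPLearn.jl/dev"))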

Path `/home/axavier/Packages/MIPLearn.jl/dev` exists and looks like the correct package. Using existing path.
   Resolving package versions...
  No Changes to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Project.toml`
  No Changes to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Manifest.toml`


In addition to MIPLearn itself, we will also install a few other packages that are required for this tutorial:

- [**Gurobi**](https://www.gurobi.com/), a state-of-the-art MIP solver
- [**JuMP**](https://jump.dev/), an open source modeling language for Julia
- [**Distributions.jl**](https://github.com/JuliaStats/Distributions.jl), a statistics package that we will use to generate random inputs

In [2]:
using Pkg
Pkg.add([
    PackageSpec(name="Gurobi", version="0.9.14"),
    PackageSpec(name="JuMP", version="0.21"),
    PackageSpec(name="Distributions", version="0.25"),
    PackageSpec(name="Glob", version="1"),
])

    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
   Resolving package versions...
  No Changes to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Project.toml`
  No Changes to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Manifest.toml`


<div class="alert alert-info">
    
Note
    
In the code above, we install specific versions of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is generally a recommended practice for Julia projects.
    
</div>

## Modeling a simple optimization problem

To illustrate how MIPLearn can be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem**, a practical optimization problem solved daily by electric grid operators around the world.

Suppose that you work at a utility company, and that it is your job to decide which electrical generators should be online at a certain hour of the day, as well as how much power each generator should produce. More specifically, assume that your company owns $n$ generators, denoted by $g_1, \ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\text{min}_i$ and $p^\text{max}_i$ megawatts of power, and it costs your company $c^\text{fix}_i + c^\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. You also know that the total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts). To minimize the costs to your company, which generators should be online, and how much power should they produce?

This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \in \{0,1\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \geq 0$ be a decision variable indicating how much power $g_i$ produces. The problem is then given by:

$$
\begin{align}
\text{minimize } \quad & \sum_{i=1}^n \left( c^\text{fix}_i x_i + c^\text{var}_i y_i \right) \\
\text{subject to } \quad & y_i \leq p^\text{max}_i x_i & i=1,\ldots,n \\
& y_i \geq p^\text{min}_i x_i & i=1,\ldots,n \\
& \sum_{i=1}^n y_i = d \\
& x_i \in \{0,1\} & i=1,\ldots,n \\
& y_i \geq 0 & i=1,\ldots,n
\end{align}
$$
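To see how the first two constraints capture the on/off behavior described earlier, it may help to substitute the two possible values of $x_i$ into them:

$$
\begin{align}
x_i = 0: \quad & 0 \leq y_i \leq 0 && \text{(offline: no production, no cost)} \\
x_i = 1: \quad & p^\text{min}_i \leq y_i \leq p^\text{max}_i && \text{(online: production within limits)}
\end{align}
$$

In the offline case both the fixed term $c^\text{fix}_i x_i$ and the variable term $c^\text{var}_i y_i$ vanish from the objective, which matches the statement that an offline generator costs nothing.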

<div class="alert alert-info">
    
Note
    
We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem. See benchmarks for more details.
    
</div>

Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Julia and JuMP. We start by defining a data structure that holds all the input data.

In [3]:
Base.@kwdef struct UnitCommitmentData
    demand::Float64
    pmin::Vector{Float64}
    pmax::Vector{Float64}
    cfix::Vector{Float64}
    cvar::Vector{Float64}
end;

Next, we create a function that converts this data structure into a concrete JuMP model. For more details on the JuMP syntax, see [the official JuMP documentation](https://jump.dev/JuMP.jl/stable/).

In [4]:
using JuMP

function build_uc_model(data::UnitCommitmentData)::Model
    model = Model()
    n = length(data.pmin)
    @variable(model, x[1:n], Bin)
    @variable(model, y[1:n] >= 0)
    @objective(
        model,
        Min,
        sum(
            data.cfix[i] * x[i] +
            data.cvar[i] * y[i]
            for i in 1:n
        )
    )
    @constraint(model, eq_max_power[i in 1:n], y[i] <= data.pmax[i] * x[i])
    @constraint(model, eq_min_power[i in 1:n], y[i] >= data.pmin[i] * x[i])
    @constraint(model, eq_demand, sum(y[i] for i in 1:n) == data.demand)
    return model
end;

At this point, we can already use JuMP and any mixed-integer linear programming solver to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators, using Gurobi:

In [5]:
using Gurobi

model = build_uc_model(
    UnitCommitmentData(
        demand = 100.0,
        pmin = [10, 20, 30],
        pmax = [50, 60, 70],
        cfix = [700, 600, 500],
        cvar = [1.5, 2.0, 2.5],
    )
)

gurobi = optimizer_with_attributes(Gurobi.Optimizer, "Threads" => 1, "Seed" => 42)
set_optimizer(model, gurobi)
set_silent(model)
optimize!(model)

println("obj = ", objective_value(model))
println("  x = ", round.(value.(model[:x])))
println("  y = ", round.(value.(model[:y]), digits=2));

obj = 1320.0
  x = [0.0, 1.0, 1.0]
  y = [0.0, 60.0, 40.0]


Running the code above, we found that the optimal solution for our small problem instance costs \$1320. It is achieved by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power.
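Because this instance has only three generators, the reported optimum can be cross-checked without a solver. The sketch below is not part of MIPLearn; the helper names `dispatch_cost` and `brute_force_uc` are hypothetical. It enumerates all on/off patterns and, for each pattern, dispatches power greedily: every online unit starts at its minimum output, and the remaining demand is assigned in order of increasing variable cost. This approach is only practical for very small $n$, since the number of patterns grows as $2^n$.

```julia
# Brute-force check of the three-generator instance, using only Base Julia.
# Hypothetical helpers for illustration; not part of MIPLearn or JuMP.

# For a fixed on/off pattern x, compute the cheapest dispatch y and its cost.
# Returns (Inf, y) if the pattern cannot meet the demand exactly.
function dispatch_cost(x, pmin, pmax, cfix, cvar, demand)
    online = findall(==(1), x)
    y = zeros(length(x))
    y[online] = pmin[online]              # online units produce at least pmin
    remaining = demand - sum(y)
    if remaining < 0
        return (Inf, y)                   # minimum output already exceeds demand
    end
    # Assign the remaining demand to the cheapest online units first
    for i in sort(online, by = j -> cvar[j])
        extra = min(pmax[i] - pmin[i], remaining)
        y[i] += extra
        remaining -= extra
    end
    if remaining > 1e-9
        return (Inf, y)                   # not enough capacity online
    end
    return (sum(cfix .* x) + sum(cvar .* y), y)
end

# Enumerate all 2^n on/off patterns and keep the cheapest feasible one.
function brute_force_uc(pmin, pmax, cfix, cvar, demand)
    n = length(pmin)
    best_cost, best_x, best_y = Inf, zeros(Int, n), zeros(n)
    for bits in Iterators.product(fill(0:1, n)...)
        x = collect(bits)
        cost, y = dispatch_cost(x, pmin, pmax, cfix, cvar, demand)
        if cost < best_cost
            best_cost, best_x, best_y = cost, x, y
        end
    end
    return best_cost, best_x, best_y
end

cost, x, y = brute_force_uc(
    [10, 20, 30], [50, 60, 70], [700, 600, 500], [1.5, 2.0, 2.5], 100.0,
)
println("obj = ", cost)
println("  x = ", x)
println("  y = ", y)
```

The enumeration arrives at the same cost of 1320, with generators 2 and 3 online: the fixed costs are $600 + 500 = 1100$ and the variable costs are $2.0 \times 60 + 2.5 \times 40 = 220$.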

## Generating training data

Although Gurobi could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as is often the case in practice, it could make sense to spend some time upfront generating a **trained** version of Gurobi, which can solve new instances (similar to the ones it was trained on) faster.

In the following, we will use MIPLearn to train machine learning models that can be used to accelerate Gurobi's performance on a particular set of instances. More specifically, MIPLearn will train a model that is able to predict the optimal solution for instances that follow a given probability distribution, then it will provide this predicted solution to Gurobi as a warm start.

Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:

In [6]:
using Distributions
using Random

function random_uc_data(; samples::Int, n::Int, seed=42)
    Random.seed!(seed)
    pmin = rand(Uniform(100, 500.0), n)
    pmax = pmin .* rand(Uniform(2.0, 2.5), n)
    cfix = pmin .* rand(Uniform(100.0, 125.0), n)
    cvar = rand(Uniform(1.25, 1.5), n)
    return [
        UnitCommitmentData(;
            pmin,
            pmax,
            cfix,
            cvar,
            demand = sum(pmax) * rand(Uniform(0.5, 0.75)),
        )
        for i in 1:samples
    ]
end;

In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.
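A short check, based only on the sampling ranges used in the generator above, suggests that every generated instance is feasible. Here $r_i \in [2, 2.5]$ denotes the sampled ratio $p^\text{max}_i / p^\text{min}_i$ (a symbol introduced here for illustration only), and $d$ is sampled between 50% and 75% of the total capacity:

$$
\begin{align}
\sum_{i=1}^n p^\text{min}_i = \sum_{i=1}^n \frac{p^\text{max}_i}{r_i} \leq \frac{1}{2} \sum_{i=1}^n p^\text{max}_i \leq d \leq \frac{3}{4} \sum_{i=1}^n p^\text{max}_i \leq \sum_{i=1}^n p^\text{max}_i
\end{align}
$$

Turning all generators on ($x_i = 1$ for every $i$) therefore always admits a dispatch $y$ that meets the demand exactly, although this is usually not the cheapest choice.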

Now we generate 500 instances of this problem, each one with 50 generators. We use 450 of these instances for training and the remaining 50 for testing. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold in memory the entire training data, as well as the concrete JuMP models. Files also make it much easier to solve multiple instances simultaneously, potentially even on multiple machines. We will cover parallel and distributed computing in a future tutorial. The code below generates the files `uc/train/00001.jld2`, `uc/train/00002.jld2`, etc., which contain the input data in [JLD2 format](https://github.com/JuliaIO/JLD2.jl).

In [7]:
using MIPLearn
data = random_uc_data(samples=500, n=50);
train_files = MIPLearn.save(data[1:450], "uc/train/")
test_files  = MIPLearn.save(data[451:500], "uc/test/");

Finally, we use `LearningSolver` to solve all the training instances. `LearningSolver` is the main component provided by MIPLearn, which integrates MIP solvers and ML. The optimal solutions, along with other useful training data, are stored in HDF5 files `uc/train/00001.h5`, `uc/train/00002.h5`, etc.

In [8]:
solver = LearningSolver(gurobi)
solve!(solver, train_files, build_uc_model);

## Solving new instances

With training data in hand, we can now fit the ML models using `MIPLearn.fit!`, then solve the test instances with `MIPLearn.solve!`, as shown below. The `tee=true` parameter asks MIPLearn to print the solver log to the screen.

In [9]:
solver_ml = LearningSolver(gurobi)
fit!(solver_ml, train_files, build_uc_model)
solve!(solver_ml, test_files[1], build_uc_model, tee=true);

Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)
Thread count: 16 physical cores, 32 logical processors, using up to 1 threads
Optimize a model with 101 rows, 100 columns and 250 nonzeros
Model fingerprint: 0xfb382c05
Coefficient statistics:
  Matrix range     [1e+00, 1e+03]
  Objective range  [1e+00, 6e+04]
  Bounds range     [1e+00, 1e+00]
  RHS range        [2e+04, 2e+04]
Presolve removed 100 rows and 50 columns
Presolve time: 0.00s
Presolved: 1 rows, 50 columns, 50 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    7.0629410e+05   6.782322e+02   0.000000e+00      0s
       1    8.0678161e+05   0.000000e+00   0.000000e+00      0s

Solved in 1 iterations and 0.00 seconds
Optimal objective  8.067816095e+05

User-callback calls 33, time in user-callback 0.00 sec

Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)
Thread count: 16 physical cores, 32 logical processors, using up to 1 threads
Optimize a model with 101 rows, 100 columns and 

By examining the solve log above, specifically the line `Loaded user MIP start with objective...`, we can see that MIPLearn was able to construct an initial solution which turned out to be near optimal for the problem. Now let us repeat the code above, but using an untrained solver. Note that the `fit!` line is omitted.

In [10]:
solver_baseline = LearningSolver(gurobi)
solve!(solver_baseline, test_files[1], build_uc_model, tee=true);

Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)
Thread count: 16 physical cores, 32 logical processors, using up to 1 threads
Optimize a model with 101 rows, 100 columns and 250 nonzeros
Model fingerprint: 0xfb382c05
Coefficient statistics:
  Matrix range     [1e+00, 1e+03]
  Objective range  [1e+00, 6e+04]
  Bounds range     [1e+00, 1e+00]
  RHS range        [2e+04, 2e+04]
Presolve removed 100 rows and 50 columns
Presolve time: 0.00s
Presolved: 1 rows, 50 columns, 50 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    7.0629410e+05   6.782322e+02   0.000000e+00      0s
       1    8.0678161e+05   0.000000e+00   0.000000e+00      0s

Solved in 1 iterations and 0.00 seconds
Optimal objective  8.067816095e+05

User-callback calls 33, time in user-callback 0.00 sec

Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)
Thread count: 16 physical cores, 32 logical processors, using up to 1 threads
Optimize a model with 101 rows, 100 columns and 

In the log above, the `MIP start` line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution at the end, but it required using its own internal heuristic procedures. In this example, because we solve very small optimization problems, there was almost no difference in terms of running time. For larger problems, however, the difference can be significant. See benchmarks for more details.
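To quantify the difference in running time rather than reading it off the logs, one option is to time each solve with Julia's built-in `@elapsed` macro. The helper `time_each` below is hypothetical and not part of MIPLearn. To keep the sketch runnable on its own, it is applied to a stand-in workload instead of a real solve.

```julia
# Time an arbitrary function over a list of inputs.
# Returns a vector with the elapsed wall-clock time (in seconds) per input.
# Hypothetical helper for illustration; not part of MIPLearn.
function time_each(f, items)
    return [@elapsed f(item) for item in items]
end

# Stand-in workload so that this block runs without a solver installed.
times = time_each(n -> sum(sqrt, 1:n), [10^4, 10^5, 10^6])
println("mean time (s) = ", sum(times) / length(times))
```

In the context of this tutorial, the function passed to `time_each` could be something like `file -> solve!(solver_ml, file, build_uc_model)`, applied to `test_files` once for the trained solver and once for the baseline. Note that the first call to any Julia function includes compilation time, so it may be fairer to discard the first measurement.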

<div class="alert alert-info">
Note
    
In addition to partial initial solutions, MIPLearn is also able to predict lazy constraints, cutting planes and branching priorities. See the next tutorials for more details.
</div>

<div class="alert alert-info">
Note
    
It is not necessary to specify what ML models to use. MIPLearn, by default, will try a number of classical ML models and will choose the one that performs the best, based on k-fold cross validation. MIPLearn is also able to automatically collect features based on the MIP formulation of the problem and the solution to the LP relaxation, among other things, so it does not require handcrafted features. If you do want to customize the models and features, however, that is also possible, as we will see in a later tutorial.
</div>

## Accessing the solution

In the example above, we used `MIPLearn.solve!` together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders, and could be retrieved by reading these files, but that is not very convenient. In the following example, we show how to build and solve a JuMP model entirely in-memory, using our trained solver.

In [11]:
# Construct model using previously defined functions
data = random_uc_data(samples=1, n=50)[1]
model = build_uc_model(data)

# Solve model
solve!(solver_ml, model)

# Print part of the optimal solution
println("obj = ", objective_value(model))
println("  x = ", round.(value.(model[:x][1:10])))
println("  y = ", round.(value.(model[:y][1:10]), digits=2))

obj = 809710.340270503
  x = [1.0, -0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0]
  y = [696.38, 0.0, 249.05, 0.0, 1183.75, 0.0, 504.91, 387.32, 1178.0, 765.25]
