diff --git a/0.2/.buildinfo b/0.2/.buildinfo
deleted file mode 100644
index 1e37ad2..0000000
--- a/0.2/.buildinfo
+++ /dev/null
@@ -1,4 +0,0 @@
-# Sphinx build info version 1
-# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 3772558d35a82a60dc512dba5e805386
-tags: d77d1c0d9ca2f4c8421862c7c5a0d620
diff --git a/0.2/_sources/benchmarks/facility.ipynb.txt b/0.2/_sources/benchmarks/facility.ipynb.txt
deleted file mode 100644
index 69f6aa4..0000000
--- a/0.2/_sources/benchmarks/facility.ipynb.txt
+++ /dev/null
@@ -1,29 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "792bbfa2",
- "metadata": {},
- "source": [
- "# Facility Location\n",
- "\n",
- "TODO"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Julia 1.6.0",
- "language": "julia",
- "name": "julia-1.6"
- },
- "language_info": {
- "file_extension": ".jl",
- "mimetype": "application/julia",
- "name": "julia",
- "version": "1.6.0"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
diff --git a/0.2/_sources/benchmarks/knapsack.ipynb.txt b/0.2/_sources/benchmarks/knapsack.ipynb.txt
deleted file mode 100644
index 0836b6e..0000000
--- a/0.2/_sources/benchmarks/knapsack.ipynb.txt
+++ /dev/null
@@ -1,111 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "d236e2a0",
- "metadata": {},
- "source": [
- "# Multidimensional Knapsack\n",
- "\n",
- "### Problem definition\n",
- "\n",
- "Given a set of $n$ items and $m$ types of resources (also called *knapsacks*), the problem is to find a subset of items that maximizes profit without consuming more resources than are available. More precisely, the problem is:\n",
- "\n",
- "$$\n",
- "\\begin{align*}\n",
- " \\text{maximize}\n",
- " & \\sum_{j=1}^n p_j x_j\n",
- " \\\\\n",
- " \\text{subject to}\n",
- " & \\sum_{j=1}^n w_{ij} x_j \\leq b_i\n",
- " & \\forall i=1,\\ldots,m \\\\\n",
- " & x_j \\in \\{0,1\\}\n",
- " & \\forall j=1,\\ldots,n\n",
- "\\end{align*}\n",
- "$$\n",
- "\n",
- "### Random instance generator\n",
- "\n",
- "The class `MultiKnapsackGenerator` can be used to generate random instances of this problem. The number of items $n$ and knapsacks $m$ are sampled from the user-provided probability distributions `n` and `m`. The weights $w_{ij}$ are sampled independently from the provided distribution `w`. The capacity of knapsack $i$ is set to\n",
- "\n",
- "$$\n",
- " b_i = \\alpha_i \\sum_{j=1}^n w_{ij}\n",
- "$$\n",
- "\n",
- "where $\\alpha_i$, the tightness ratio, is sampled from the provided probability\n",
- "distribution `alpha`. To make the instances more challenging, the prices of the items\n",
- "are linearly correlated with their average weights. More specifically, the price of each\n",
- "item $j$ is set to:\n",
- "\n",
- "$$\n",
- " p_j = \\sum_{i=1}^m \\frac{w_{ij}}{m} + K u_j,\n",
- "$$\n",
- "\n",
- "where $K$, the correlation coefficient, and $u_j$, the correlation multiplier, are sampled\n",
- "from the provided probability distributions `K` and `u`.\n",
- "\n",
- "If `fix_w=True` is provided, then $w_{ij}$ are kept the same in all generated instances. This also implies that $n$ and $m$ are kept fixed. Although the prices and capacities are derived from $w_{ij}$, as long as `u` and `K` are not constants, the generated instances will still not be completely identical.\n",
- "\n",
- "\n",
- "If a probability distribution `w_jitter` is provided, then item weights will be set to $w_{ij} \\gamma_{ij}$ where $\\gamma_{ij}$ is sampled from `w_jitter`. When combined with `fix_w=True`, this argument may be used to generate instances where the weight of each item is roughly the same, but not exactly identical, across all instances. The prices of the items and the capacities of the knapsacks will be calculated as above, but using these perturbed weights instead.\n",
- "\n",
- "By default, all generated prices, weights and capacities are rounded to the nearest integer number. If `round=False` is provided, this rounding will be disabled.\n",
- "\n",
- "\n",
- "\n",
- "References\n",
- "\n",
- "* **Fréville, Arnaud, and Gérard Plateau.** *An efficient preprocessing procedure for the multidimensional 0–1 knapsack problem.* Discrete Applied Mathematics 49.1-3 (1994): 189-212.\n",
- "* **Fréville, Arnaud.** *The multidimensional 0–1 knapsack problem: An overview.* European Journal of Operational Research 155.1 (2004): 1-21.\n",
- "\n",
- "Note\n",
- " \n",
- "In this tutorial, we use SCIP because it is more widely available than commercial MIP solvers. However, all the steps below should work for Gurobi, CPLEX or XPRESS, as long as you have a license for these solvers. The performance impact of MIPLearn may also change for different solvers.\n",
- "\n",
- "\n",
- "\n",
- "Warning\n",
- " \n",
- "MIPLearn is still in an early stage of development. If you run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!\n",
- " \n",
- "\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "8b97258c",
- "metadata": {},
- "source": [
- "## Installation\n",
- "\n",
- "MIPLearn is available in two versions:\n",
- "\n",
- "- Python version, compatible with the Pyomo modeling language,\n",
- "- Julia version, compatible with the JuMP modeling language.\n",
- "\n",
- "In this tutorial, we will demonstrate how to install and use the Julia/JuMP version of the package. The first step is to install the Julia programming language on your computer. [See the official instructions for more details](https://julialang.org/downloads/). Note that MIPLearn was developed and tested with Julia 1.6, and may not be compatible with newer versions of the language. After Julia is installed, launch its console and run the following commands to download and install the package:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "id": "2dbeacbc",
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m git-repo `https://github.com/ANL-CEEESA/MIPLearn.jl.git`\n",
- "\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m registry at `~/.julia/registries/General`\n",
- "\u001b[32m\u001b[1m Updating\u001b[22m\u001b[39m git-repo `https://github.com/JuliaRegistries/General.git`\n",
- "\u001b[32m\u001b[1m Resolving\u001b[22m\u001b[39m package versions...\n",
- "\u001b[32m\u001b[1m No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Project.toml`\n",
- "\u001b[32m\u001b[1m No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Manifest.toml`\n"
- ]
- }
- ],
- "source": [
- "using Pkg\n",
- "Pkg.add(PackageSpec(url=\"https://github.com/ANL-CEEESA/MIPLearn.jl.git\"))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b2f449e7",
- "metadata": {},
- "source": [
- "In addition to MIPLearn itself, we will also install a few other packages that are required for this tutorial:\n",
- "\n",
- "- [**SCIP**](https://www.scipopt.org/), one of the fastest non-commercial MIP solvers currently available\n",
- "- [**JuMP**](https://jump.dev/), an open source modeling language for Julia\n",
- "- [**Distributions.jl**](https://github.com/JuliaStats/Distributions.jl), a statistics package that we will use to generate random inputs\n",
- "- [**Glob.jl**](https://github.com/vtjnash/Glob.jl), a package that retrieves all files in a directory matching a certain pattern"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "id": "68f99568",
- "metadata": {},
- "outputs": [
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "\u001b[32m\u001b[1m Resolving\u001b[22m\u001b[39m package versions...\n",
- "\u001b[32m\u001b[1m No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Project.toml`\n",
- "\u001b[32m\u001b[1m No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Manifest.toml`\n"
- ]
- }
- ],
- "source": [
- "using Pkg\n",
- "Pkg.add([\n",
- " PackageSpec(url=\"https://github.com/scipopt/SCIP.jl.git\", rev=\"7aa79aaa\"),\n",
- " PackageSpec(name=\"JuMP\", version=\"0.21\"),\n",
- " PackageSpec(name=\"Distributions\", version=\"0.25\"),\n",
- " PackageSpec(name=\"Glob\", version=\"1\"),\n",
- "])"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "51e09fc9",
- "metadata": {},
- "source": [
- "\n",
- " \n",
- "Note\n",
- " \n",
- "In the code above, we install specific versions of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is generally recommended practice for all Julia projects.\n",
- " \n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "id": "18c300c4",
- "metadata": {},
- "source": [
- "## Modeling a simple optimization problem\n",
- "\n",
- "To illustrate how MIPLearn can be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem,** a practical optimization problem solved daily by electric grid operators around the world. \n",
- "\n",
- "Suppose that you work at a utility company, and that it is your job to decide which electrical generators should be online at a certain hour of the day, as well as how much power each generator should produce. More specifically, assume that your company owns $n$ generators, denoted by $g_1, \\ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\\text{min}_i$ and $p^\\text{max}_i$ megawatts of power, and it costs your company $c^\\text{fix}_i + c^\\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. You also know that the total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts). To minimize the costs to your company, which generators should be online, and how much power should they produce?\n",
- "\n",
- "This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \\in \\{0,1\\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \\geq 0$ be a decision variable indicating how much power $g_i$ produces. The problem is then given by:\n",
- "\n",
- "$$\n",
- "\\begin{align}\n",
- "\\text{minimize } \\quad & \\sum_{i=1}^n \\left( c^\\text{fix}_i x_i + c^\\text{var}_i y_i \\right) \\\\\n",
- "\\text{subject to } \\quad & y_i \\leq p^\\text{max}_i x_i & i=1,\\ldots,n \\\\\n",
- "& y_i \\geq p^\\text{min}_i x_i & i=1,\\ldots,n \\\\\n",
- "& \\sum_{i=1}^n y_i = d \\\\\n",
- "& x_i \\in \\{0,1\\} & i=1,\\ldots,n \\\\\n",
- "& y_i \\geq 0 & i=1,\\ldots,n\n",
- "\\end{align}\n",
- "$$\n",
- "\n",
- "\n",
- " \n",
- "Note\n",
- " \n",
- "We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem. See benchmarks for more details.\n",
- " \n",
- "\n",
- "\n",
- "Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Julia and JuMP. We start by defining a data structure that holds all the input data."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "id": "b12d6483",
- "metadata": {},
- "outputs": [],
- "source": [
- "Base.@kwdef struct UnitCommitmentData\n",
- " demand::Float64\n",
- " pmin::Vector{Float64}\n",
- " pmax::Vector{Float64}\n",
- " cfix::Vector{Float64}\n",
- " cvar::Vector{Float64}\n",
- "end;"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "55cdb64b",
- "metadata": {},
- "source": [
- "Next, we create a function that converts this data structure into a concrete JuMP model. For more details on the JuMP syntax, see [the official JuMP documentation](https://jump.dev/JuMP.jl/stable/)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "id": "1e38a266",
- "metadata": {},
- "outputs": [],
- "source": [
- "using JuMP\n",
- "\n",
- "function build_uc_model(data::UnitCommitmentData)::Model\n",
- " model = Model()\n",
- " n = length(data.pmin)\n",
- " @variable(model, x[1:n], Bin)\n",
- " @variable(model, y[1:n] >= 0)\n",
- " @objective(\n",
- " model,\n",
- " Min,\n",
- " sum(\n",
- " data.cfix[i] * x[i] +\n",
- " data.cvar[i] * y[i]\n",
- " for i in 1:n\n",
- " )\n",
- " )\n",
- " @constraint(model, eq_max_power[i in 1:n], y[i] <= data.pmax[i] * x[i])\n",
- " @constraint(model, eq_min_power[i in 1:n], y[i] >= data.pmin[i] * x[i])\n",
- " @constraint(model, eq_demand, sum(y[i] for i in 1:n) == data.demand)\n",
- " return model\n",
- "end;"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "d28c4d5a",
- "metadata": {},
- "source": [
- "At this point, we can already use JuMP and any mixed-integer linear programming solver to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators, using SCIP:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "id": "9ff9f05c",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "obj = 1320.0\n",
- " x = [0.0, 1.0, 1.0]\n",
- " y = [0.0, 60.0, 40.0]\n"
- ]
- }
- ],
- "source": [
- "using SCIP\n",
- "\n",
- "model = build_uc_model(\n",
- " UnitCommitmentData(\n",
- " demand = 100.0,\n",
- " pmin = [10, 20, 30],\n",
- " pmax = [50, 60, 70],\n",
- " cfix = [700, 600, 500],\n",
- " cvar = [1.5, 2.0, 2.5],\n",
- " )\n",
- ")\n",
- "\n",
- "scip = optimizer_with_attributes(SCIP.Optimizer, \"limits/gap\" => 1e-4)\n",
- "set_optimizer(model, scip)\n",
- "set_silent(model)\n",
- "optimize!(model)\n",
- "\n",
- "println(\"obj = \", objective_value(model))\n",
- "println(\" x = \", round.(value.(model[:x])))\n",
- "println(\" y = \", round.(value.(model[:y]), digits=2));"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "345de591",
- "metadata": {},
- "source": [
- "Running the code above, we found that the optimal solution for our small problem instance costs \\$1320. It is achieved by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "eb8904ef",
- "metadata": {},
- "source": [
- "## Generating training data\n",
- "\n",
- "Although SCIP could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as is often the case in practice, it could make sense to spend some time upfront generating a **trained** version of SCIP, which can solve new instances (similar to the ones it was trained on) faster.\n",
- "\n",
- "In the following, we will use MIPLearn to train machine learning models that can be used to accelerate SCIP's performance on a particular set of instances. More specifically, MIPLearn will train a model that is able to predict the optimal solution for instances that follow a given probability distribution, then it will provide this predicted solution to SCIP as a warm start.\n",
- "\n",
- "Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "id": "7298bb0d",
- "metadata": {},
- "outputs": [],
- "source": [
- "using Distributions\n",
- "using Random\n",
- "\n",
- "function random_uc_data(; samples::Int, n::Int, seed=42)\n",
- " Random.seed!(seed)\n",
- " pmin = rand(Uniform(100, 500.0), n)\n",
- " pmax = pmin .* rand(Uniform(2.0, 2.5), n)\n",
- " cfix = pmin .* rand(Uniform(100.0, 125.0), n)\n",
- " cvar = rand(Uniform(1.25, 1.5), n)\n",
- " return [\n",
- " UnitCommitmentData(;\n",
- " pmin,\n",
- " pmax,\n",
- " cfix,\n",
- " cvar,\n",
- " demand = sum(pmax) * rand(Uniform(0.5, 0.75)),\n",
- " )\n",
- " for i in 1:samples\n",
- " ]\n",
- "end;"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "c1feed43",
- "metadata": {},
- "source": [
- "In this example, for simplicity, only the demands change from one instance to the next. We could also have made the prices and the production limits random. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.\n",
- "\n",
- "Now we generate 100 instances of this problem, each one with 1,000 generators. We will use the first 90 instances for training, and the remaining 10 instances to evaluate SCIP's performance."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "id": "61d43994",
- "metadata": {},
- "outputs": [],
- "source": [
- "data = random_uc_data(samples=100, n=1000);\n",
- "train_data = data[1:90]\n",
- "test_data = data[91:100];"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "3fdeb8cd",
- "metadata": {},
- "source": [
- "Next, we write these data structures to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold the entire training data, as well as the concrete JuMP models, in memory. Files also make it much easier to solve multiple instances simultaneously, potentially even on multiple machines. We will cover parallel and distributed computing in a future tutorial.\n",
- "\n",
- "The code below generates the files `uc/train/000001.jld2`, `uc/train/000002.jld2`, etc., which contain the input data in [JLD2 format](https://github.com/JuliaIO/JLD2.jl)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "id": "31b48701",
- "metadata": {},
- "outputs": [],
- "source": [
- "using MIPLearn\n",
- "MIPLearn.save(data[1:90], \"uc/train/\")\n",
- "MIPLearn.save(data[91:100], \"uc/test/\")\n",
- "\n",
- "using Glob\n",
- "train_files = glob(\"uc/train/*.jld2\")\n",
- "test_files = glob(\"uc/test/*.jld2\");"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5cecea59",
- "metadata": {},
- "source": [
- "Finally, we use `MIPLearn.LearningSolver` and `MIPLearn.solve!` to solve all the training instances. `LearningSolver` is the main component provided by MIPLearn, which integrates MIP solvers and ML. The `solve!` function can be used to solve either one or multiple instances, and requires: (i) the list of files containing the training data; and (ii) the function that converts the data structure into a concrete JuMP model:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "id": "60732af0",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "103.808547 seconds (93.52 M allocations: 3.604 GiB, 1.19% gc time, 0.52% compilation time)\n"
- ]
- },
- {
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "WARNING: Dual bound 1.98665e+07 is larger than the objective of the primal solution 1.98665e+07. The solution might not be optimal.\n"
- ]
- }
- ],
- "source": [
- "using Glob\n",
- "solver = LearningSolver(scip)\n",
- "@time solve!(solver, train_files, build_uc_model);"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "bbc7ad82",
- "metadata": {},
- "source": [
- "The macro `@time` shows us how long the code took to run. We can see that SCIP was able to solve all training instances in about 2 minutes. The solutions, and other useful training data, are saved by MIPLearn in `.h5` files, stored side-by-side with the original `.jld2` files."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "73379180",
- "metadata": {},
- "source": [
- "## Solving new instances\n",
- "\n",
- "With training data in hand, we can now fit the ML models using `MIPLearn.fit!`, then solve the test instances with `MIPLearn.solve!`, as shown below:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "id": "e045d644",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- " 5.951264 seconds (9.33 M allocations: 334.657 MiB, 1.51% gc time)\n"
- ]
- }
- ],
- "source": [
- "solver_ml = LearningSolver(scip)\n",
- "fit!(solver_ml, train_files, build_uc_model)\n",
- "@time solve!(solver_ml, test_files, build_uc_model);"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "d8de7b26",
- "metadata": {},
- "source": [
- "The trained MIP solver was able to solve all test instances in about 6 seconds. To see that ML is being helpful here, let us repeat the code above, but remove the `fit!` line:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "id": "cf2a989e",
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- " 10.390325 seconds (8.17 M allocations: 278.042 MiB, 0.89% gc time)\n"
- ]
- }
- ],
- "source": [
- "solver_baseline = LearningSolver(scip)\n",
- "@time solve!(solver_baseline, test_files, build_uc_model);"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e100b25d",
- "metadata": {},
- "source": [
- "Without the help of the ML models, SCIP took around 10 seconds to solve the same test instances.\n",
- "\n",
- "\n",
- "Note\n",
- " \n",
- "Note that it is not necessary to specify which ML models to use. MIPLearn, by default, will try a number of classical ML models and will choose the one that performs the best, based on k-fold cross validation. MIPLearn is also able to automatically collect features based on the MIP formulation of the problem and the solution to the LP relaxation, among other things, so it does not require handcrafted features. If you do want to customize the models and features, however, that is also possible, as we will see in a later tutorial.\n",
- "