From d20c41704dfa436e6e70a539042678a3f1b52a83 Mon Sep 17 00:00:00 2001
From: Alinson S Xavier
Date: Thu, 27 May 2021 17:20:00 -0500
Subject: [PATCH] Update docs

---
 docs/benchmark.md     | 177 ----------------------------------------
 docs/customization.md | 182 ------------------------------------------
 2 files changed, 359 deletions(-)
 delete mode 100644 docs/benchmark.md
 delete mode 100644 docs/customization.md

diff --git a/docs/benchmark.md b/docs/benchmark.md
deleted file mode 100644
index 88b6db5..0000000
--- a/docs/benchmark.md
+++ /dev/null
@@ -1,177 +0,0 @@
```{sectnum}
---
start: 2
depth: 2
suffix: .
---
```

# Benchmarks

MIPLearn provides a selection of benchmark problems and random instance generators, covering applications from different fields, that can be used to evaluate new learning-enhanced MIP techniques in a measurable and reproducible way. On this page, we describe these problems and their instance generators, and we present baseline benchmark results for `LearningSolver` with default parameters.

## Preliminaries

### Benchmark challenges

When evaluating the performance of a conventional MIP solver, *benchmark sets*, such as MIPLIB and TSPLIB, are typically used. The performance of a newly proposed solver or solution technique is typically measured as the average (or total) running time the solver takes to solve the entire benchmark set. For learning-enhanced MIP solvers, it is also necessary to specify which instances the solver should be trained on (the *training instances*) before it solves the actual set of instances we are interested in (the *test instances*). If the training instances are very similar to the test instances, we would expect a learning-enhanced solver to show stronger performance benefits.

In MIPLearn, each optimization problem comes with a set of **benchmark challenges**, which specify how the training and test instances should be generated. The first challenges are typically easier, in the sense that training and test instances are very similar. Later challenges gradually make the two sets more distinct, and therefore harder to learn from.

### Baseline results

To illustrate the performance of `LearningSolver`, and to set a baseline for newly proposed techniques, we present on this page, for each benchmark challenge, a small set of computational results measuring the solution speed and solution quality of the solver with default parameters. For more detailed computational studies, see [references](about.md#references). We compare three solvers:

* **baseline:** Gurobi 9.0 with default settings (a conventional state-of-the-art MIP solver)
* **ml-exact:** `LearningSolver` with default settings, using Gurobi 9.0 as the internal MIP solver
* **ml-heuristic:** Same as above, but with `mode="heuristic"`

All experiments presented here were performed on a Linux server (Ubuntu Linux 18.04 LTS) with Intel Xeon Gold 6230s (2 processors, 40 cores, 80 threads) and 256 GB RAM (DDR4, 2933 MHz). Each solver was restricted to 4 threads, with no time limit, and 10 instances were solved simultaneously.

## Maximum Weight Stable Set Problem

### Problem definition

Given a simple undirected graph $G=(V,E)$ and weights $w \in \mathbb{R}^V$, the problem is to find a stable set $S \subseteq V$ that maximizes $\sum_{v \in S} w_v$. We recall that a subset $S \subseteq V$ is a *stable set* if no two vertices of $S$ are adjacent. This is one of Karp's 21 NP-complete problems.
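The problem admits a natural integer programming formulation, reproduced below for reference, with one binary variable $x_v$ indicating whether vertex $v$ belongs to the stable set. (This is the textbook edge formulation; the exact formulation used by MIPLearn's built-in instances may differ in details.)

$$
\begin{align*}
    \text{maximize}
    & \sum_{v \in V} w_v x_v
    \\
    \text{subject to}
    & x_u + x_v \leq 1
    & \forall \{u,v\} \in E \\
    & x_v \in \{0,1\}
    & \forall v \in V
\end{align*}
$$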
### Random instance generator

The class `MaxWeightStableSetGenerator` can be used to generate random instances of this problem, with user-specified probability distributions. When the constructor parameter `fix_graph=True` is provided, a single random Erdős-Rényi graph $G_{n,p}$ is generated at construction time, where $n$ and $p$ are sampled from the user-provided probability distributions `n` and `p`. To generate each instance, the generator then independently samples each $w_v$ from the user-provided probability distribution `w`. When `fix_graph=False`, a new random graph is generated for each instance, while the remaining parameters are sampled in the same way.

### Challenge A

* Fixed random Erdős-Rényi graph $G_{n,p}$ with $n=200$ and $p=5\%$
* Random vertex weights $w_v \sim U(100, 150)$
* 500 training instances, 50 test instances

```python
MaxWeightStableSetGenerator(w=uniform(loc=100., scale=50.),
                            n=randint(low=200, high=201),
                            p=uniform(loc=0.05, scale=0.0),
                            fix_graph=True)
```

![alt](figures/benchmark_stab_a.png)


## Traveling Salesman Problem

### Problem definition

Given a list of cities and the distance between each pair of cities, the problem asks for the shortest route that starts at the first city, visits every other city exactly once, and returns to the first city. This problem is a generalization of the Hamiltonian circuit problem, one of Karp's 21 NP-complete problems.

### Random instance generator

The class `TravelingSalesmanGenerator` can be used to generate random instances of this problem. Initially, the generator creates $n$ cities $(x_1,y_1),\ldots,(x_n,y_n) \in \mathbb{R}^2$, where $n$, $x_i$ and $y_i$ are sampled independently from the provided probability distributions `n`, `x` and `y`. For each pair of cities $(i,j)$, the distance $d_{i,j}$ between them is set to

$$
    d_{i,j} = \gamma_{i,j} \sqrt{(x_i-x_j)^2 + (y_i - y_j)^2},
$$

where $\gamma_{i,j}$ is sampled from the distribution `gamma`.

If `fix_cities=True` is provided, the list of cities is kept the same for all generated instances. The $\gamma$ values, and therefore also the distances, still vary from instance to instance.

By default, all distances $d_{i,j}$ are rounded to the nearest integer. If `round=False` is provided, this rounding is disabled.

### Challenge A

* Fixed list of 350 cities in the $[0, 1000]^2$ square
* $\gamma_{i,j} \sim U(0.95, 1.05)$
* 500 training instances, 50 test instances

```python
TravelingSalesmanGenerator(x=uniform(loc=0.0, scale=1000.0),
                           y=uniform(loc=0.0, scale=1000.0),
                           n=randint(low=350, high=351),
                           gamma=uniform(loc=0.95, scale=0.1),
                           fix_cities=True,
                           round=True,
                          )
```

![alt](figures/benchmark_tsp_a.png)


## Multidimensional 0-1 Knapsack Problem

### Problem definition

Given a set of $n$ items and $m$ types of resources (also called *knapsacks*), the problem is to find a subset of items that maximizes profit without consuming more of each resource than is available. More precisely, the problem is:

$$
\begin{align*}
    \text{maximize}
    & \sum_{j=1}^n p_j x_j
    \\
    \text{subject to}
    & \sum_{j=1}^n w_{ij} x_j \leq b_i
    & \forall i=1,\ldots,m \\
    & x_j \in \{0,1\}
    & \forall j=1,\ldots,n
\end{align*}
$$

### Random instance generator

The class `MultiKnapsackGenerator` can be used to generate random instances of this problem. The number of items $n$ and knapsacks $m$ are sampled from the user-provided probability distributions `n` and `m`.
The weights $w_{ij}$ are sampled independently from the provided distribution `w`. The capacity of knapsack $i$ is set to

$$
    b_i = \alpha_i \sum_{j=1}^n w_{ij},
$$

where $\alpha_i$, the tightness ratio, is sampled from the provided probability distribution `alpha`. To make the instances more challenging, the prices of the items are linearly correlated with their average weights. More specifically, the price of each item $j$ is set to

$$
    p_j = \sum_{i=1}^m \frac{w_{ij}}{m} + K u_j,
$$

where $K$, the correlation coefficient, and $u_j$, the correlation multiplier, are sampled from the provided probability distributions `K` and `u`.

If `fix_w=True` is provided, then the weights $w_{ij}$ are kept the same in all generated instances; this also implies that $n$ and $m$ are kept fixed. Although the prices and capacities are derived from $w_{ij}$, as long as `u` and `K` are not constants, the generated instances will still not be completely identical.

If a probability distribution `w_jitter` is provided, then item weights are instead set to $w_{ij} \gamma_{ij}$, where $\gamma_{ij}$ is sampled from `w_jitter`. When combined with `fix_w=True`, this argument may be used to generate instances where the weight of each item is roughly the same, but not exactly identical, across all instances. The prices of the items and the capacities of the knapsacks are calculated as above, but using these perturbed weights instead.

By default, all generated prices, weights and capacities are rounded to the nearest integer. If `round=False` is provided, this rounding is disabled.

!!! note "References"
    * Fréville, Arnaud, and Gérard Plateau. *An efficient preprocessing procedure for the multidimensional 0–1 knapsack problem.* Discrete Applied Mathematics 49.1-3 (1994): 189-212.
    * Fréville, Arnaud. *The multidimensional 0–1 knapsack problem: An overview.* European Journal of Operational Research 155.1 (2004): 1-21.

### Challenge A

* 250 variables, 10 constraints, fixed weights
* $w \sim U(0, 1000), \gamma \sim U(0.95, 1.05)$
* $K = 500, u \sim U(0, 1), \alpha = 0.25$
* 500 training instances, 50 test instances

```python
MultiKnapsackGenerator(n=randint(low=250, high=251),
                       m=randint(low=10, high=11),
                       w=uniform(loc=0.0, scale=1000.0),
                       K=uniform(loc=500.0, scale=0.0),
                       u=uniform(loc=0.0, scale=1.0),
                       alpha=uniform(loc=0.25, scale=0.0),
                       fix_w=True,
                       w_jitter=uniform(loc=0.95, scale=0.1),
                      )
```

![alt](figures/benchmark_knapsack_a.png)

diff --git a/docs/customization.md b/docs/customization.md
deleted file mode 100644
index cc087fc..0000000
--- a/docs/customization.md
+++ /dev/null
@@ -1,182 +0,0 @@
```{sectnum}
---
start: 3
depth: 2
suffix: .
---
```

# Customization

## Customizing solver parameters

### Selecting the internal MIP solver

By default, `LearningSolver` uses [Gurobi](https://www.gurobi.com/) as its internal MIP solver, and expects models to be provided using the Pyomo modeling language. Supported solvers and modeling languages include:

* `GurobiPyomoSolver`: Gurobi with Pyomo (default).
* `CplexPyomoSolver`: [IBM ILOG CPLEX](https://www.ibm.com/products/ilog-cplex-optimization-studio) with Pyomo.
* `XpressPyomoSolver`: [FICO XPRESS Solver](https://www.fico.com/en/products/fico-xpress-solver) with Pyomo.
* `GurobiSolver`: Gurobi without any modeling language.
To switch between solvers, provide the desired class using the `solver` argument:

```python
from miplearn import LearningSolver, CplexPyomoSolver
solver = LearningSolver(solver=CplexPyomoSolver)
```

To configure a particular solver, use the `params` constructor argument, as shown below.

```python
from miplearn import LearningSolver, GurobiPyomoSolver
solver = LearningSolver(
    solver=lambda: GurobiPyomoSolver(
        params={
            "TimeLimit": 900,
            "MIPGap": 1e-3,
            "NodeLimit": 1000,
        }
    ),
)
```


## Customizing solver components

`LearningSolver` is composed of a number of individual machine-learning components, each targeting a different part of the solution process. Each component can be individually enabled, disabled or customized. The following components are enabled by default:

* `LazyConstraintComponent`: Predicts which lazy constraints to enforce initially.
* `ObjectiveValueComponent`: Predicts the optimal value of the optimization problem, given the optimal solution to the LP relaxation.
* `PrimalSolutionComponent`: Predicts optimal values for binary decision variables. In heuristic mode, this component fixes the variables to their predicted values. In exact mode, the predicted values are provided to the solver as a (partial) MIP start.

The following components are also available, but not enabled by default:

* `BranchPriorityComponent`: Predicts good branch priorities for decision variables.

### Selecting components

To create a `LearningSolver` with a specific set of components, the `components` constructor argument may be used, as the next example shows:

```python
# Create a solver without any components
solver1 = LearningSolver(components=[])

# Create a solver with only two components
solver2 = LearningSolver(components=[
    LazyConstraintComponent(...),
    PrimalSolutionComponent(...),
])
```

### Adjusting component aggressiveness

The aggressiveness of classification components, such as `PrimalSolutionComponent` and `LazyConstraintComponent`, can be adjusted through the `threshold` constructor argument. Internally, these components ask the machine-learning models how confident they are in each prediction they make, then automatically discard all predictions that have low confidence. The `threshold` argument specifies how confident the ML models should be for a prediction to be considered trustworthy. Lowering a component's threshold increases its aggressiveness, while raising it makes the component more conservative.

For example, if the ML model predicts that a certain binary variable will assume value `1.0` in the optimal solution with 75% confidence, and if the `PrimalSolutionComponent` is configured to discard all predictions with less than 90% confidence, then this variable will not be included in the predicted MIP start.

MIPLearn currently provides two types of thresholds:

* `MinProbabilityThreshold(p: List[float])`: A fixed threshold, which indicates that a prediction is trustworthy if its probability of being correct, as computed by the machine learning model, is above a fixed value (see the example following this list).
* `MinPrecisionThreshold(p: List[float])`: A dynamic threshold, which automatically adjusts itself during training to ensure that the component achieves at least a given precision on the training data set. Note that increasing a component's precision may reduce its recall.
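The following sketch attaches a fixed threshold to a `PrimalSolutionComponent`. The two entries of the list correspond to the two predicted classes (variable value zero and one, following the same convention as the precision example below), and the 90% cutoff is an arbitrary illustration:

```python
from miplearn import PrimalSolutionComponent, MinProbabilityThreshold

# Discard any prediction whose estimated probability of being
# correct is below 90%, for both the zero and the one class.
comp = PrimalSolutionComponent(
    threshold=MinProbabilityThreshold([0.90, 0.90]),
)
```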
The example below shows how to build a `PrimalSolutionComponent` which fixes variables to zero with at least 80% precision, and to one with at least 95% precision. Other components are configured similarly.

```python
from miplearn import PrimalSolutionComponent, MinPrecisionThreshold

PrimalSolutionComponent(
    mode="heuristic",
    threshold=MinPrecisionThreshold([0.80, 0.95]),
)
```

### Evaluating component performance

MIPLearn allows solver components to be modified, trained and evaluated in isolation. In the following example, we build and fit a `PrimalSolutionComponent` outside the solver, then evaluate its performance.

```python
from miplearn import PrimalSolutionComponent

# User-provided set of previously-solved instances
train_instances = [...]

# Construct and fit component on a subset of training instances
comp = PrimalSolutionComponent()
comp.fit(train_instances[:100])

# Evaluate performance on an additional set of training instances
ev = comp.evaluate(train_instances[100:150])
```

The method `evaluate` returns a dictionary with performance evaluation statistics for each training instance provided, and for each type of prediction the component makes. To obtain a summary across all instances, pandas may be used, as below:

```python
import pandas as pd
pd.DataFrame(ev["Fix one"]).mean(axis=1)
```
```text
Predicted positive          3.120000
Predicted negative        196.880000
Condition positive         62.500000
Condition negative        137.500000
True positive               3.060000
True negative             137.440000
False positive              0.060000
False negative             59.440000
Accuracy                    0.702500
F1 score                    0.093050
Recall                      0.048921
Precision                   0.981667
Predicted positive (%)      1.560000
Predicted negative (%)     98.440000
Condition positive (%)     31.250000
Condition negative (%)     68.750000
True positive (%)           1.530000
True negative (%)          68.720000
False positive (%)          0.030000
False negative (%)         29.720000
dtype: float64
```

Regression components (such as `ObjectiveValueComponent`) can also be trained and evaluated similarly, as the next example shows:

```python
from miplearn import ObjectiveValueComponent
comp = ObjectiveValueComponent()
comp.fit(train_instances[:100])
ev = comp.evaluate(train_instances[100:150])

import pandas as pd
pd.DataFrame(ev).mean(axis=1)
```
```text
Mean squared error       7001.977827
Explained variance          0.519790
Max error                 242.375804
Mean absolute error        65.843924
R2                          0.517612
Median absolute error      65.843924
dtype: float64
```

### Using customized ML classifiers and regressors

By default, given a training set of instances, MIPLearn trains a fixed set of ML classifiers and regressors, then selects the best one based on cross-validation performance. Alternatively, the user may specify which ML model a component should use through the `classifier` or `regressor` constructor parameters. Scikit-learn classifiers and regressors are currently supported; a future version of the package will add compatibility with Keras models.

The example below shows how to construct a `PrimalSolutionComponent` which internally uses scikit-learn's `KNeighborsClassifier`. Any other scikit-learn classifier or pipeline can be used, as long as it is wrapped in `ScikitLearnClassifier`, which ensures that all the proper data transformations are applied.
```python
from miplearn import PrimalSolutionComponent, ScikitLearnClassifier
from sklearn.neighbors import KNeighborsClassifier

# Wrap the scikit-learn classifier, then pass it to the component
comp = PrimalSolutionComponent(
    classifier=ScikitLearnClassifier(
        KNeighborsClassifier(n_neighbors=5),
    ),
)

# Train the component on previously-solved instances
comp.fit(train_instances)
```
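Since `ScikitLearnClassifier` accepts pipelines as well as plain classifiers, preprocessing steps can be bundled together with the model. The sketch below illustrates this with a standard scaler followed by a logistic regression; this particular pipeline is an arbitrary choice for illustration, not a recommendation from the package.

```python
from miplearn import PrimalSolutionComponent, ScikitLearnClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A scikit-learn pipeline is wrapped exactly like a plain classifier
comp = PrimalSolutionComponent(
    classifier=ScikitLearnClassifier(
        make_pipeline(StandardScaler(), LogisticRegression()),
    ),
)
comp.fit(train_instances)
```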