MIPLearn v0.3

2025-12-05 17:08:51 -06:00 · 2023-06-08 11:25:39 -05:00
parent 6cc253a903
commit 1ea989d48a
172 changed files with 10495 additions and 24812 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -1,26 +0,0 @@
 ---
 name: Bug report
 about: Something is broken in the package
 title: ''
 labels: ''
 assignees: ''
 ---
 ## Description
 A clear and concise description of what the bug is.
 ## Steps to Reproduce
 Please describe how the developers can reproduce the problem on their own computers. Code snippets and sample input files are especially helpful. For example:
 1. Install the package
 2. Run the code below with the attached input file...
 3. The following error appears...
 ## System Information
 - Operating System: [e.g. Ubuntu 20.04]
 - Python version: [e.g. 3.6]
 - Solver: [e.g. Gurobi 9.0]
 - Package version: [e.g. 0.1.0]
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -1,8 +0,0 @@
 blank_issues_enabled: false
 contact_links:
  - name: Feature Request
    url: https://github.com/ANL-CEEESA/MIPLearn/discussions/categories/feature-requests
    about:  Submit ideas for new features and small enhancements
  - name: Help & FAQ
    url: https://github.com/ANL-CEEESA/MIPLearn/discussions/categories/help-faq
    about: Ask questions about the package and get help from the community
--- a/.github/workflows/lint.yml
+++ b/.github/workflows/lint.yml
@@ -1,11 +0,0 @@
 name: Lint
 on: [push, pull_request]
 jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - uses: psf/black@20.8b1
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -1,27 +0,0 @@
 name: Test
 on:
  push:
  pull_request:
  schedule:
    - cron: '45 10 * * *'
 jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.7, 3.8]
    steps:
    - name: Check out source code
      uses: actions/checkout@v2
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: make install-deps
    - name: Test
      run: make test
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,3 @@
 *.h5
 *.jld2
 TODO.md
 .idea
 *.gz
@@ -80,6 +78,8 @@ wheels/
 notebooks/
 .vscode
 tmp
-benchmark/tsp
+benchmark/data
-benchmark/stab
+benchmark/results
-benchmark/knapsack
+**/*.xz
 **/*.h5
 **/*.jld2
--- a/.mypy.ini
+++ b/.mypy.ini
@@ -4,4 +4,4 @@ disallow_untyped_defs = True
 disallow_untyped_calls = True
 disallow_incomplete_defs = True
 pretty = True
-no_implicit_optional = True
+no_implicit_optional = True
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,6 +0,0 @@
 repos:
  - repo: https://github.com/ambv/black
    rev: 20.8b1
    hooks:
    - id: black
      args: ["--check"]
--- a/.zenodo.json
+++ b/.zenodo.json
@@ -0,0 +1,27 @@
 {
    "creators": [
        {
            "orcid": "0000-0002-5022-9802",
            "affiliation": "Argonne National Laboratory",
            "name": "Santos Xavier, Alinson"
        },
        {
            "affiliation": "Argonne National Laboratory",
            "name": "Qiu, Feng"
        },
        {
            "affiliation": "Georgia Institute of Technology",
            "name": "Gu, Xiaoyi"
        },
        {
            "affiliation": "Georgia Institute of Technology",
            "name": "Becu, Berkay"
        },
        {
            "affiliation": "Georgia Institute of Technology",
            "name": "Dey, Santanu S."
        }
    ],
    "title": "MIPLearn: An Extensible Framework for Learning-Enhanced Optimization",
    "description": "<b>MIPLearn</b> is an extensible framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML). MIPLearn uses ML methods to automatically identify patterns in previously solved instances of the problem, then uses these patterns to accelerate the performance of conventional state-of-the-art MIP solvers such as CPLEX, Gurobi or XPRESS."
 }
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,45 +1,34 @@
-# MIPLearn: Changelog
+# Changelog
-## [0.2.0] - [Unreleased]
+All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [0.3.0] - 2023-06-08
 This is a complete rewrite of the original prototype package, with an entirely new API, focused on performance, scalability and flexibility.
 ### Added
- **Added two new machine learning components:**
+- Add support for Python/Gurobipy and Julia/JuMP, in addition to the existing Python/Pyomo interface.
-  - Added `StaticLazyConstraintComponent`, which allows the user to mark some constraints in the formulation as lazy, instead of constructing them in a callback. ML predicts which static lazy constraints should be kept in the formulation, and which should be removed. 
+- Add six new random instance generators (bin packing, capacitated p-median, set cover, set packing, unit commitment, vertex cover), in addition to the three existing generators (multiknapsack, stable set, tsp).
-  - Added `UserCutComponents`, which predicts which user cuts should be generated and added to the formulation as constraints ahead-of-time, before solving the MIP.
+- Collect additional raw training data (e.g. basis status, reduced costs).
- **Added support for additional MILP solvers:**
+- Add new primal solution ML strategies (memorizing, independent vars and joint vars)
-  - Added support for CPLEX and XPRESS, through the Pyomo modeling language, in addition to (existing) Gurobi. The solver classes are named `CplexPyomoSolver`, `XpressPyomoSolver` and `GurobiPyomoSolver`.
+- Add new primal solution actions (set warm start, fix variables, enforce proximity)
-  - Added support for Gurobi without any modeling language. The solver class is named `GurobiSolver`. In this case, `instance.to_model` should return a `gp.Model` object.
+- Add runnable tutorials and user guides to the documentation.
 - Added direct support for externally produced MPS files, through the `GurobiSolver` class mentioned above.
 - **Added dynamic thresholds:** 
  - In previous versions of the package, it was necessary to manually adjust component aggressiveness to reach a desired precision/recall. This can now be done automatically with `MinProbabilityThreshold`, `MinPrecisionThreshold` and `MinRecallThreshold`.
 - **Reduced memory requirements:**
 - Previous versions of the package required all training instances to be kept in memory at all times, which was prohibitive for large-scale problems. It is now possible to store instances in files until they are needed, using `PickledGzInstance`.
 - **Refactoring:**
  - Added static types to all classes (with mypy).
 ### Changed
- Variables are now referenced by their names, instead of tuples `(var_name, index)`. This change was required to improve the compatibility with modeling languages other than Pyomo, which do not follow this convention. For performance reasons, the functions `get_variable_features` and `get_variable_categories` should now return a dictionary containing categories and features for all relevant variables. Previously, MIPLearn had to perform two function calls per variable, which was too slow for very large models.
+- To support large-scale problems and datasets, switch from an in-memory architecture to a file-based architecture, using HDF5 files.
- Internal solvers must now be specified as objects, instead of strings. For example,
+- To accelerate the development cycle, split training data collection from feature extraction.
  ```python
  solver = LearningSolver(
      solver=GurobiPyomoSolver(
          params={
              "TimeLimit": 300,
              "Threads": 4,
          }      
      )
  )
  ```
 - `LazyConstraintComponent` has been renamed to `DynamicLazyConstraintsComponent`.
 - Categories, lazy constraints and cutting plane identifiers must now be strings, instead of `Hashable`. This change was required for compatibility with the HDF5 data format.
 ### Removed
- Temporarily removed the experimental `BranchPriorityComponent`. This component will be re-added in the Julia version of the package.
+- Temporarily remove ML strategies for lazy constraints
- Removed `solver.add` method, previously used to add components to an existing solver. Use the constructor `LearningSolver(components=[...])` instead.
+- Remove benchmarks from documentation. These will be published in a separate paper.
 ## [0.1.0] - 2020-11-23
- Initial public release
+- Initial public release
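The "dynamic thresholds" entry in the changelog above describes choosing a component's aggressiveness automatically so that a desired precision is reached. The sketch below illustrates that general idea in plain Python. It is not MIPLearn's implementation of `MinPrecisionThreshold`: the function name `min_threshold_for_precision` and its parameters are hypothetical, and it assumes a binary classifier whose predicted probabilities and true labels are available from a validation set.

```python
from typing import List, Optional


def min_threshold_for_precision(
    probabilities: List[float],
    labels: List[bool],
    min_precision: float,
) -> Optional[float]:
    """Return the smallest probability cutoff whose precision is at least
    min_precision, or None if no cutoff reaches it.

    Hypothetical helper for illustration only; not part of the MIPLearn API.
    A sample is predicted positive when its probability is >= the cutoff.
    """
    # Candidate cutoffs are the distinct predicted probabilities, lowest first,
    # so the first qualifying cutoff keeps as many positive predictions as possible.
    for cutoff in sorted(set(probabilities)):
        # True labels of all samples that would be predicted positive.
        predicted = [label for p, label in zip(probabilities, labels) if p >= cutoff]
        true_positives = sum(predicted)
        # `predicted` is never empty here, because the cutoff itself is one of
        # the probabilities and therefore selects at least one sample.
        if true_positives / len(predicted) >= min_precision:
            return cutoff
    return None


# Example validation data (made up for this sketch).
probs = [0.2, 0.4, 0.6, 0.8, 0.9]
truth = [False, True, False, True, True]
cutoff = min_threshold_for_precision(probs, truth, min_precision=0.7)
```

With the made-up data above, a lower precision target yields a lower cutoff (more predictions are acted upon), and a higher target yields a higher cutoff. If no cutoff reaches the target, the function returns `None` and the caller would have to decide whether to skip the prediction entirely.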
--- a/2
+++ b/2
@@ -22,4 +22,4 @@ DISCLAIMER
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-********************************************************************************
+********************************************************************************
--- a/13
+++ b/13
@@ -2,8 +2,8 @@ PYTHON      := python3
 PYTEST      := pytest
 PIP         := $(PYTHON) -m pip
 MYPY        := $(PYTHON) -m mypy
-PYTEST_ARGS := -W ignore::DeprecationWarning -vv --log-level=DEBUG tests
+PYTEST_ARGS := -W ignore::DeprecationWarning -vv --log-level=DEBUG
-VERSION     := 0.2
+VERSION     := 0.3
 all: docs test
@@ -24,11 +24,8 @@ docs:
 	cd docs; make clean; make dirhtml
 	rsync -avP --delete-after docs/_build/dirhtml/ ../docs/$(VERSION)
 install-deps:
 	$(PIP) install --upgrade pip
 	$(PIP) install --upgrade -i https://pypi.gurobi.com 'gurobipy>=9.5,<9.6'
 	$(PIP) install --upgrade xpress
 	$(PIP) install --upgrade -r requirements.txt
 install:
@@ -41,9 +38,11 @@ reformat:
 	$(PYTHON) -m black .
 test:
 	# pyflakes miplearn tests
 	black --check .
 	# rm -rf .mypy_cache
-	# $(MYPY) -p miplearn
+	$(MYPY) -p miplearn
-	# $(MYPY) -p tests
+	$(MYPY) -p tests
 	$(PYTEST) $(PYTEST_ARGS) 
 .PHONY: test test-watch docs install dist
--- a/README.md
+++ b/README.md
@@ -14,36 +14,51 @@
  </a>
 </p>
-**MIPLearn** is an extensible framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML).
+**MIPLearn** is an extensible framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML). MIPLearn uses ML methods to automatically identify patterns in previously solved instances of the problem, then uses these patterns to accelerate the performance of conventional state-of-the-art MIP solvers such as CPLEX, Gurobi or XPRESS.
-MIPLearn uses ML methods to automatically identify patterns in previously solved instances of the problem, then uses these patterns to accelerate the performance of conventional state-of-the-art MIP solvers such as CPLEX, Gurobi or XPRESS. Unlike pure ML methods, MIPLearn is not only able to find high-quality solutions to discrete optimization problems, but it can also prove the optimality and feasibility of these solutions. Unlike conventional MIP solvers, MIPLearn can take full advantage of very specific observations that happen to be true in a particular family of instances (such as the observation that a particular constraint is typically redundant, or that a particular variable typically assumes a certain value). For certain classes of problems, this approach has been shown to provide significant performance benefits (see [benchmarks](https://anl-ceeesa.github.io/MIPLearn/0.1/problems/) and [references](https://anl-ceeesa.github.io/MIPLearn/0.1/about/)).
+Unlike pure ML methods, MIPLearn is not only able to find high-quality solutions to discrete optimization problems, but it can also prove the optimality and feasibility of these solutions. Unlike conventional MIP solvers, MIPLearn can take full advantage of very specific observations that happen to be true in a particular family of instances (such as the observation that a particular constraint is typically redundant, or that a particular variable typically assumes a certain value). For certain classes of problems, this approach may provide significant performance benefits.
 Features
 --------
 * **MIPLearn proposes a flexible problem specification format,** which allows users to describe their particular optimization problems to a Learning-Enhanced MIP solver, both from the MIP perspective and from the ML perspective, without making any assumptions on the problem being modeled, the mathematical formulation of the problem, or ML encoding.
 * **MIPLearn provides a reference implementation of a *Learning-Enhanced Solver*,** which can use the above problem specification format to automatically predict, based on previously solved instances, a number of hints to accelerate MIP performance. 
 * **MIPLearn provides a set of benchmark problems and random instance generators,** covering applications from different domains, which can be used to quickly evaluate new learning-enhanced MIP techniques in a measurable and reproducible way.
 * **MIPLearn is customizable and extensible**. For MIP and ML researchers exploring new techniques to accelerate MIP performance based on historical data, each component of the reference solver can be individually replaced, extended or customized.
 Documentation
 -------------
-For installation instructions, basic usage and benchmarks results, see the [official documentation](https://anl-ceeesa.github.io/MIPLearn/).
+- Tutorials:
    1. [Getting started (Pyomo)](https://anl-ceeesa.github.io/MIPLearn/0.3/tutorials/getting-started-pyomo/)
    2. [Getting started (Gurobipy)](https://anl-ceeesa.github.io/MIPLearn/0.3/tutorials/getting-started-gurobipy/)
    3. [Getting started (JuMP)](https://anl-ceeesa.github.io/MIPLearn/0.3/tutorials/getting-started-jump/)
 - User Guide
    1. [Benchmark problems](https://anl-ceeesa.github.io/MIPLearn/0.3/guide/problems/)
    2. [Training data collectors](https://anl-ceeesa.github.io/MIPLearn/0.3/guide/collectors/)
    3. [Feature extractors](https://anl-ceeesa.github.io/MIPLearn/0.3/guide/features/)
    4. [Primal components](https://anl-ceeesa.github.io/MIPLearn/0.3/guide/primal/)
    5. [Learning solver](https://anl-ceeesa.github.io/MIPLearn/0.3/guide/solvers/)
 - Python API Reference
    1. [Benchmark problems](https://anl-ceeesa.github.io/MIPLearn/0.3/api/problems/)
    2. [Collectors & extractors](https://anl-ceeesa.github.io/MIPLearn/0.3/api/collectors/)
    3. [Components](https://anl-ceeesa.github.io/MIPLearn/0.3/api/components/)
    4. [Solvers](https://anl-ceeesa.github.io/MIPLearn/0.3/api/solvers/)
    5. [Helpers](https://anl-ceeesa.github.io/MIPLearn/0.3/api/helpers/)
 Authors
 -------
 - **Alinson S. Xavier** (Argonne National Laboratory)
 - **Feng Qiu** (Argonne National Laboratory)
 - **Xiaoyi Gu** (Georgia Institute of Technology)
 - **Berkay Becu** (Georgia Institute of Technology)
 - **Santanu S. Dey**  (Georgia Institute of Technology)
 Acknowledgments
 ---------------
-* Based upon work supported by **Laboratory Directed Research and Development** (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357.
+* Based upon work supported by **Laboratory Directed Research and Development** (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy.
-* Based upon work supported by the **U.S. Department of Energy Advanced Grid Modeling Program** under Grant DE-OE0000875.
+* Based upon work supported by the **U.S. Department of Energy Advanced Grid Modeling Program**.
 Citing MIPLearn
 ---------------
 If you use MIPLearn in your research (either the solver or the included problem generators), we kindly request that you cite the package as follows:
-* **Alinson S. Xavier, Feng Qiu.** *MIPLearn: An Extensible Framework for Learning-Enhanced Optimization*. Zenodo (2020). DOI: [10.5281/zenodo.4287567](https://doi.org/10.5281/zenodo.4287567)
+* **Alinson S. Xavier, Feng Qiu, Xiaoyi Gu, Berkay Becu, Santanu S. Dey.** *MIPLearn: An Extensible Framework for Learning-Enhanced Optimization (Version 0.3)*. Zenodo (2023). DOI: [10.5281/zenodo.4287567](https://doi.org/10.5281/zenodo.4287567)
 If you use MIPLearn in the field of power systems optimization, we kindly request that you cite the reference below, in which the main techniques implemented in MIPLearn were first developed:
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -1,8 +1,14 @@
 # Minimal makefile for Sphinx documentation
 #
 # You can set these variables from the command line, and also
 # from the environment for the first two.
 SPHINXOPTS    ?=
 SPHINXBUILD   ?= sphinx-build
 SOURCEDIR     = .
 BUILDDIR      = _build
 # Put it first so that "make" without argument is like "make help".
 help:
 	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
--- a/docs/_static/custom.css
+++ b/docs/_static/custom.css
@@ -8,11 +8,10 @@ h1.site-logo small {
 code {
    display: inline-block;
    color: #222 !important;
    background-color: rgba(0 0 0 / 8%);
    border-radius: 4px;
    padding: 0 4px;
-
+    background-color: #eee;
    color: rgb(232, 62, 140);
 }
 .right-next, .left-prev {
@@ -50,13 +49,72 @@ code {
 .output_area pre {
    color: #fff;
    line-height: 20px !important;
 }
 .input_area pre {
    background-color: rgba(0 0 0 / 3%) !important;
    padding: 12px !important;
    line-height: 20px;
 }
 .ansi-green-intense-fg {
    color: #64d88b !important;
-}
+}
 #site-navigation {
    background-color: #fafafa;
 }
 .container, .container-lg, .container-md, .container-sm, .container-xl {
    max-width: inherit !important;
 }
 h1, h2 {
    font-weight: bold !important;
 }
 #main-content .section {
    max-width: 900px !important;
    margin: 0 auto !important;
    font-size: 16px;
 }
 p.caption {
    font-weight: bold;
 }
 h2 {
    padding-bottom: 5px;
    border-bottom: 1px solid #ccc;
 }
 h3 {
    margin-top: 1.5rem;
 }
 tbody, thead, pre {
    border: 1px solid rgba(0, 0, 0, 0.25);
 }
 table td, th {
    padding: 8px;
 }
 table p {
    margin-bottom: 0;
 }
 table td code {
    white-space: nowrap;
 }
 table tr,
 table th {
    border-bottom: 1px solid rgba(0, 0, 0, 0.1);
 }
 table tr:last-child {
    border-bottom: 0;
 }
--- a/docs/api/collectors.rst
+++ b/docs/api/collectors.rst
@@ -0,0 +1,42 @@
 Collectors & Extractors
 =======================
 miplearn.classifiers.minprob
 ----------------------------
 .. automodule:: miplearn.classifiers.minprob
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.classifiers.singleclass
 --------------------------------
 .. automodule:: miplearn.classifiers.singleclass
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.collectors.basic
 -------------------------
 .. automodule:: miplearn.collectors.basic
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.extractors.fields
 --------------------------
 .. automodule:: miplearn.extractors.fields
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.extractors.AlvLouWeh2017
 ---------------------------------
 .. automodule:: miplearn.extractors.AlvLouWeh2017
   :members:
   :undoc-members:
   :show-inheritance:
--- a/docs/api/components.rst
+++ b/docs/api/components.rst
@@ -0,0 +1,44 @@
 Components
 ==========
 miplearn.components.primal.actions
 ----------------------------------
 .. automodule:: miplearn.components.primal.actions
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.components.primal.expert
 ----------------------------------
 .. automodule:: miplearn.components.primal.expert
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.components.primal.indep
 ----------------------------------
 .. automodule:: miplearn.components.primal.indep
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.components.primal.joint
 ----------------------------------
 .. automodule:: miplearn.components.primal.joint
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.components.primal.mem
 ----------------------------------
 .. automodule:: miplearn.components.primal.mem
   :members:
   :undoc-members:
   :show-inheritance:
--- a/docs/api/helpers.rst
+++ b/docs/api/helpers.rst
@@ -0,0 +1,18 @@
 Helpers
 =======
 miplearn.io
 -----------
 .. automodule:: miplearn.io
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.h5
 -----------
 .. automodule:: miplearn.h5
   :members:
   :undoc-members:
   :show-inheritance:
--- a/docs/api/problems.rst
+++ b/docs/api/problems.rst
@@ -0,0 +1,57 @@
 Benchmark Problems
 ==================
 miplearn.problems.binpack
 -------------------------
 .. automodule:: miplearn.problems.binpack
   :members:
 miplearn.problems.multiknapsack
 -------------------------------
 .. automodule:: miplearn.problems.multiknapsack
   :members:
 miplearn.problems.pmedian
 -------------------------
 .. automodule:: miplearn.problems.pmedian
   :members:
 miplearn.problems.setcover
 --------------------------
 .. automodule:: miplearn.problems.setcover
   :members:
 miplearn.problems.setpack
 -------------------------
 .. automodule:: miplearn.problems.setpack
   :members:
 miplearn.problems.stab
 ----------------------
 .. automodule:: miplearn.problems.stab
   :members:
 miplearn.problems.tsp
 ---------------------
 .. automodule:: miplearn.problems.tsp
   :members:
 miplearn.problems.uc
 --------------------
 .. automodule:: miplearn.problems.uc
   :members:
 miplearn.problems.vertexcover
 -----------------------------
 .. automodule:: miplearn.problems.vertexcover
   :members:
--- a/docs/api/solvers.rst
+++ b/docs/api/solvers.rst
@@ -0,0 +1,26 @@
 Solvers
 =======
 miplearn.solvers.abstract
 -------------------------
 .. automodule:: miplearn.solvers.abstract
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.solvers.gurobi
 -------------------------
 .. automodule:: miplearn.solvers.gurobi
   :members:
   :undoc-members:
   :show-inheritance:
 miplearn.solvers.learning
 -------------------------
 .. automodule:: miplearn.solvers.learning
   :members:
   :undoc-members:
   :show-inheritance:
--- a/docs/benchmarks/Manifest.toml
+++ b/docs/benchmarks/Manifest.toml
@@ -1,671 +0,0 @@
 # This file is machine-generated - editing it directly is not advised
 [[ASL_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "370cafc70604b2522f2c7cf9915ebcd17b4cd38b"
 uuid = "ae81ac8f-d209-56e5-92de-9978fef736f9"
 version = "0.1.2+0"
 [[ArgTools]]
 uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"
 [[Artifacts]]
 uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
 [[Base64]]
 uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
 [[BenchmarkTools]]
 deps = ["JSON", "Logging", "Printf", "Profile", "Statistics", "UUIDs"]
 git-tree-sha1 = "61adeb0823084487000600ef8b1c00cc2474cd47"
 uuid = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 version = "1.2.0"
 [[BinaryProvider]]
 deps = ["Libdl", "Logging", "SHA"]
 git-tree-sha1 = "ecdec412a9abc8db54c0efc5548c64dfce072058"
 uuid = "b99e7846-7c00-51b0-8f62-c81ae34c0232"
 version = "0.5.10"
 [[Bzip2_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "19a35467a82e236ff51bc17a3a44b69ef35185a2"
 uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0"
 version = "1.0.8+0"
 [[CEnum]]
 git-tree-sha1 = "215a9aa4a1f23fbd05b92769fdd62559488d70e9"
 uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82"
 version = "0.4.1"
 [[CSV]]
 deps = ["CodecZlib", "Dates", "FilePathsBase", "Mmap", "Parsers", "PooledArrays", "SentinelArrays", "Tables", "Unicode", "WeakRefStrings"]
 git-tree-sha1 = "7c2d71ad51fd4347193463b0a065e4dc7063e248"
 uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
 version = "0.9.3"
 [[Calculus]]
 deps = ["LinearAlgebra"]
 git-tree-sha1 = "f641eb0a4f00c343bbc32346e1217b86f3ce9dad"
 uuid = "49dc2e85-a5d0-5ad3-a950-438e2897f1b9"
 version = "0.5.1"
 [[Cbc]]
 deps = ["BinaryProvider", "CEnum", "Cbc_jll", "Libdl", "MathOptInterface", "SparseArrays"]
 git-tree-sha1 = "98e3692f90b26a340f32e17475c396c3de4180de"
 uuid = "9961bab8-2fa3-5c5a-9d89-47fab24efd76"
 version = "0.8.1"
 [[Cbc_jll]]
 deps = ["ASL_jll", "Artifacts", "Cgl_jll", "Clp_jll", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS32_jll", "Osi_jll", "Pkg"]
 git-tree-sha1 = "7693a7ca006d25e0d0097a5eee18ce86368e00cd"
 uuid = "38041ee0-ae04-5750-a4d2-bb4d0d83d27d"
 version = "200.1000.500+1"
 [[Cgl_jll]]
 deps = ["Artifacts", "Clp_jll", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Osi_jll", "Pkg"]
 git-tree-sha1 = "b5557f48e0e11819bdbda0200dbfa536dd12d9d9"
 uuid = "3830e938-1dd0-5f3e-8b8e-b3ee43226782"
 version = "0.6000.200+0"
 [[ChainRulesCore]]
 deps = ["Compat", "LinearAlgebra", "SparseArrays"]
 git-tree-sha1 = "4ce9393e871aca86cc457d9f66976c3da6902ea7"
 uuid = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
 version = "1.4.0"
 [[Clp]]
 deps = ["BinaryProvider", "CEnum", "Clp_jll", "Libdl", "MathOptInterface", "SparseArrays"]
 git-tree-sha1 = "3df260c4a5764858f312ec2a17f5925624099f3a"
 uuid = "e2554f3b-3117-50c0-817c-e040a3ddf72d"
 version = "0.8.4"
 [[Clp_jll]]
 deps = ["Artifacts", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "METIS_jll", "MUMPS_seq_jll", "OpenBLAS32_jll", "Osi_jll", "Pkg"]
 git-tree-sha1 = "5e4f9a825408dc6356e6bf1015e75d2b16250ec8"
 uuid = "06985876-5285-5a41-9fcb-8948a742cc53"
 version = "100.1700.600+0"
 [[CodecBzip2]]
 deps = ["Bzip2_jll", "Libdl", "TranscodingStreams"]
 git-tree-sha1 = "2e62a725210ce3c3c2e1a3080190e7ca491f18d7"
 uuid = "523fee87-0ab8-5b00-afb7-3ecf72e48cfd"
 version = "0.7.2"
 [[CodecZlib]]
 deps = ["TranscodingStreams", "Zlib_jll"]
 git-tree-sha1 = "ded953804d019afa9a3f98981d99b33e3db7b6da"
 uuid = "944b1d66-785c-5afd-91f1-9de20f533193"
 version = "0.7.0"
 [[CoinUtils_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "9b4a8b1087376c56189d02c3c1a48a0bba098ec2"
 uuid = "be027038-0da8-5614-b30d-e42594cb92df"
 version = "2.11.4+2"
 [[CommonSubexpressions]]
 deps = ["MacroTools", "Test"]
 git-tree-sha1 = "7b8a93dba8af7e3b42fecabf646260105ac373f7"
 uuid = "bbf7d656-a473-5ed7-a52c-81e309532950"
 version = "0.3.0"
 [[Compat]]
 deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"]
 git-tree-sha1 = "4866e381721b30fac8dda4c8cb1d9db45c8d2994"
 uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
 version = "3.37.0"
 [[CompilerSupportLibraries_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
 [[Conda]]
 deps = ["JSON", "VersionParsing"]
 git-tree-sha1 = "299304989a5e6473d985212c28928899c74e9421"
 uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
 version = "1.5.2"
 [[Crayons]]
 git-tree-sha1 = "3f71217b538d7aaee0b69ab47d9b7724ca8afa0d"
 uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f"
 version = "4.0.4"
 [[DataAPI]]
 git-tree-sha1 = "bec2532f8adb82005476c141ec23e921fc20971b"
 uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
 version = "1.8.0"
 [[DataFrames]]
 deps = ["Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "LinearAlgebra", "Markdown", "Missings", "PooledArrays", "PrettyTables", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"]
 git-tree-sha1 = "d785f42445b63fc86caa08bb9a9351008be9b765"
 uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 version = "1.2.2"
 [[DataStructures]]
 deps = ["Compat", "InteractiveUtils", "OrderedCollections"]
 git-tree-sha1 = "7d9d316f04214f7efdbb6398d545446e246eff02"
 uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
 version = "0.18.10"
 [[DataValueInterfaces]]
 git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6"
 uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464"
 version = "1.0.0"
 [[Dates]]
 deps = ["Printf"]
 uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
 [[DelimitedFiles]]
 deps = ["Mmap"]
 uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"
 [[DiffResults]]
 deps = ["StaticArrays"]
 git-tree-sha1 = "c18e98cba888c6c25d1c3b048e4b3380ca956805"
 uuid = "163ba53b-c6d8-5494-b064-1a9d43ac40c5"
 version = "1.0.3"
 [[DiffRules]]
 deps = ["NaNMath", "Random", "SpecialFunctions"]
 git-tree-sha1 = "7220bc21c33e990c14f4a9a319b1d242ebc5b269"
 uuid = "b552c78f-8df3-52c6-915a-8e097449b14b"
 version = "1.3.1"
 [[Distributed]]
 deps = ["Random", "Serialization", "Sockets"]
 uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"
 [[Distributions]]
 deps = ["ChainRulesCore", "FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns"]
 git-tree-sha1 = "f4efaa4b5157e0cdb8283ae0b5428bc9208436ed"
 uuid = "31c24e10-a181-5473-b8eb-7969acd0382f"
 version = "0.25.16"
 [[DocStringExtensions]]
 deps = ["LibGit2"]
 git-tree-sha1 = "a32185f5428d3986f47c2ab78b1f216d5e6cc96f"
 uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
 version = "0.8.5"
 [[Downloads]]
 deps = ["ArgTools", "LibCURL", "NetworkOptions"]
 uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
 [[ExprTools]]
 git-tree-sha1 = "b7e3d17636b348f005f11040025ae8c6f645fe92"
 uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
 version = "0.1.6"
 [[FileIO]]
 deps = ["Pkg", "Requires", "UUIDs"]
 git-tree-sha1 = "3c041d2ac0a52a12a27af2782b34900d9c3ee68c"
 uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
 version = "1.11.1"
 [[FilePathsBase]]
 deps = ["Dates", "Mmap", "Printf", "Test", "UUIDs"]
 git-tree-sha1 = "6d4b609786127030d09e6b1ee0e2044ec20eb403"
 uuid = "48062228-2e41-5def-b9a4-89aafe57970f"
 version = "0.9.11"
 [[FillArrays]]
 deps = ["LinearAlgebra", "Random", "SparseArrays", "Statistics"]
 git-tree-sha1 = "caf289224e622f518c9dbfe832cdafa17d7c80a6"
 uuid = "1a297f60-69ca-5386-bcde-b61e274b549b"
 version = "0.12.4"
 [[Formatting]]
 deps = ["Printf"]
 git-tree-sha1 = "8339d61043228fdd3eb658d86c926cb282ae72a8"
 uuid = "59287772-0a20-5a39-b81b-1366585eb4c0"
 version = "0.4.2"
 [[ForwardDiff]]
 deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "LinearAlgebra", "NaNMath", "Printf", "Random", "SpecialFunctions", "StaticArrays"]
 git-tree-sha1 = "b5e930ac60b613ef3406da6d4f42c35d8dc51419"
 uuid = "f6369f11-7733-5829-9624-2563aa707210"
 version = "0.10.19"
 [[Future]]
 deps = ["Random"]
 uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820"
 [[GZip]]
 deps = ["Libdl"]
 git-tree-sha1 = "039be665faf0b8ae36e089cd694233f5dee3f7d6"
 uuid = "92fee26a-97fe-5a0c-ad85-20a5f3185b63"
 version = "0.5.1"
 [[Gurobi]]
 deps = ["CEnum", "Libdl", "MathOptInterface"]
 git-tree-sha1 = "aac05324d46b53289ccb05510b05b4a56ffd3ed5"
 uuid = "2e9cd046-0924-5485-92f1-d5272153d98b"
 version = "0.9.14"
 [[HTTP]]
 deps = ["Base64", "Dates", "IniFile", "Logging", "MbedTLS", "NetworkOptions", "Sockets", "URIs"]
 git-tree-sha1 = "60ed5f1643927479f845b0135bb369b031b541fa"
 uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3"
 version = "0.9.14"
 [[IniFile]]
 deps = ["Test"]
 git-tree-sha1 = "098e4d2c533924c921f9f9847274f2ad89e018b8"
 uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f"
 version = "0.5.0"
 [[InteractiveUtils]]
 deps = ["Markdown"]
 uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
 [[InvertedIndices]]
 git-tree-sha1 = "bee5f1ef5bf65df56bdd2e40447590b272a5471f"
 uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f"
 version = "1.1.0"
 [[IrrationalConstants]]
 git-tree-sha1 = "f76424439413893a832026ca355fe273e93bce94"
 uuid = "92d709cd-6900-40b7-9082-c6be49f344b6"
 version = "0.1.0"
 [[IteratorInterfaceExtensions]]
 git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856"
 uuid = "82899510-4779-5014-852e-03e436cf321d"
 version = "1.0.0"
 [[JLD2]]
 deps = ["DataStructures", "FileIO", "MacroTools", "Mmap", "Pkg", "Printf", "Reexport", "TranscodingStreams", "UUIDs"]
 git-tree-sha1 = "192934b3e2a94e897ce177423fd6cf7bdf464bce"
 uuid = "033835bb-8acc-5ee8-8aae-3f567f8a3819"
 version = "0.4.14"
 [[JLLWrappers]]
 deps = ["Preferences"]
 git-tree-sha1 = "642a199af8b68253517b80bd3bfd17eb4e84df6e"
 uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210"
 version = "1.3.0"
 [[JSON]]
 deps = ["Dates", "Mmap", "Parsers", "Unicode"]
 git-tree-sha1 = "8076680b162ada2a031f707ac7b4953e30667a37"
 uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
 version = "0.21.2"
 [[JSONSchema]]
 deps = ["HTTP", "JSON", "URIs"]
 git-tree-sha1 = "2f49f7f86762a0fbbeef84912265a1ae61c4ef80"
 uuid = "7d188eb4-7ad8-530c-ae41-71a32a6d4692"
 version = "0.3.4"
 [[JuMP]]
 deps = ["Calculus", "DataStructures", "ForwardDiff", "JSON", "LinearAlgebra", "MathOptInterface", "MutableArithmetics", "NaNMath", "Printf", "Random", "SparseArrays", "SpecialFunctions", "Statistics"]
 git-tree-sha1 = "4358b7cbf2db36596bdbbe3becc6b9d87e4eb8f5"
 uuid = "4076af6c-e467-56ae-b986-b466b2749572"
 version = "0.21.10"
 [[LibCURL]]
 deps = ["LibCURL_jll", "MozillaCACerts_jll"]
 uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
 [[LibCURL_jll]]
 deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
 uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
 [[LibGit2]]
 deps = ["Base64", "NetworkOptions", "Printf", "SHA"]
 uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
 [[LibSSH2_jll]]
 deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
 uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
 [[Libdl]]
 uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
 [[LinearAlgebra]]
 deps = ["Libdl"]
 uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 [[LogExpFunctions]]
 deps = ["ChainRulesCore", "DocStringExtensions", "IrrationalConstants", "LinearAlgebra"]
 git-tree-sha1 = "34dc30f868e368f8a17b728a1238f3fcda43931a"
 uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
 version = "0.3.3"
 [[Logging]]
 uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
 [[METIS_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "2dc1a9fc87e57e32b1fc186db78811157b30c118"
 uuid = "d00139f3-1899-568f-a2f0-47f597d42d70"
 version = "5.1.0+5"
 [[MIPLearn]]
 deps = ["CSV", "Cbc", "Clp", "Conda", "DataFrames", "DataStructures", "Distributed", "JLD2", "JSON", "JuMP", "Logging", "MathOptInterface", "OrderedCollections", "PackageCompiler", "Printf", "ProgressBars", "PyCall", "Random", "SparseArrays", "Statistics", "TimerOutputs"]
 path = "/home/axavier/Packages/MIPLearn.jl/dev/"
 uuid = "2b1277c3-b477-4c49-a15e-7ba350325c68"
 version = "0.2.0"
 [[MUMPS_seq_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "METIS_jll", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "a1d469a2a0acbfe219ef9bdfedae97daacac5a0e"
 uuid = "d7ed1dd3-d0ae-5e8e-bfb4-87a502085b8d"
 version = "5.4.0+0"
 [[MacroTools]]
 deps = ["Markdown", "Random"]
 git-tree-sha1 = "5a5bc6bf062f0f95e62d0fe0a2d99699fed82dd9"
 uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
 version = "0.5.8"
 [[Markdown]]
 deps = ["Base64"]
 uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
 [[MathOptInterface]]
 deps = ["BenchmarkTools", "CodecBzip2", "CodecZlib", "JSON", "JSONSchema", "LinearAlgebra", "MutableArithmetics", "OrderedCollections", "SparseArrays", "Test", "Unicode"]
 git-tree-sha1 = "575644e3c05b258250bb599e57cf73bbf1062901"
 uuid = "b8f27783-ece8-5eb3-8dc8-9495eed66fee"
 version = "0.9.22"
 [[MbedTLS]]
 deps = ["Dates", "MbedTLS_jll", "Random", "Sockets"]
 git-tree-sha1 = "1c38e51c3d08ef2278062ebceade0e46cefc96fe"
 uuid = "739be429-bea8-5141-9913-cc70e7f3736d"
 version = "1.0.3"
 [[MbedTLS_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
 [[Missings]]
 deps = ["DataAPI"]
 git-tree-sha1 = "bf210ce90b6c9eed32d25dbcae1ebc565df2687f"
 uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
 version = "1.0.2"
 [[Mmap]]
 uuid = "a63ad114-7e13-5084-954f-fe012c677804"
 [[MozillaCACerts_jll]]
 uuid = "14a3606d-f60d-562e-9121-12d972cd8159"
 [[MutableArithmetics]]
 deps = ["LinearAlgebra", "SparseArrays", "Test"]
 git-tree-sha1 = "3927848ccebcc165952dc0d9ac9aa274a87bfe01"
 uuid = "d8a4904e-b15c-11e9-3269-09a3773c0cb0"
 version = "0.2.20"
 [[NaNMath]]
 git-tree-sha1 = "bfe47e760d60b82b66b61d2d44128b62e3a369fb"
 uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3"
 version = "0.3.5"
 [[NetworkOptions]]
 uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"
 [[OpenBLAS32_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "ba4a8f683303c9082e84afba96f25af3c7fb2436"
 uuid = "656ef2d0-ae68-5445-9ca0-591084a874a2"
 version = "0.3.12+1"
 [[OpenSpecFun_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1"
 uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
 version = "0.5.5+0"
 [[OrderedCollections]]
 git-tree-sha1 = "85f8e6578bf1f9ee0d11e7bb1b1456435479d47c"
 uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
 version = "1.4.1"
 [[Osi_jll]]
 deps = ["Artifacts", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "6a9967c4394858f38b7fc49787b983ba3847e73d"
 uuid = "7da25872-d9ce-5375-a4d3-7a845f58efdd"
 version = "0.108.6+2"
 [[PDMats]]
 deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"]
 git-tree-sha1 = "4dd403333bcf0909341cfe57ec115152f937d7d8"
 uuid = "90014a1f-27ba-587c-ab20-58faa44d9150"
 version = "0.11.1"
 [[PackageCompiler]]
 deps = ["Libdl", "Pkg", "UUIDs"]
 git-tree-sha1 = "b8283f57d58e224ce8544934491e389bebdc720c"
 uuid = "9b87118b-4619-50d2-8e1e-99f35a4d4d9d"
 version = "1.5.0"
 [[Parsers]]
 deps = ["Dates"]
 git-tree-sha1 = "438d35d2d95ae2c5e8780b330592b6de8494e779"
 uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
 version = "2.0.3"
 [[Pkg]]
 deps = ["Artifacts", "Dates", "Downloads", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"]
 uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 [[PooledArrays]]
 deps = ["DataAPI", "Future"]
 git-tree-sha1 = "a193d6ad9c45ada72c14b731a318bedd3c2f00cf"
 uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720"
 version = "1.3.0"
 [[Preferences]]
 deps = ["TOML"]
 git-tree-sha1 = "00cfd92944ca9c760982747e9a1d0d5d86ab1e5a"
 uuid = "21216c6a-2e73-6563-6e65-726566657250"
 version = "1.2.2"
 [[PrettyTables]]
 deps = ["Crayons", "Formatting", "Markdown", "Reexport", "Tables"]
 git-tree-sha1 = "0d1245a357cc61c8cd61934c07447aa569ff22e6"
 uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
 version = "1.1.0"
 [[Printf]]
 deps = ["Unicode"]
 uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
 [[Profile]]
 deps = ["Printf"]
 uuid = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"
 [[ProgressBars]]
 deps = ["Printf"]
 git-tree-sha1 = "938525cc66a4058f6ed75b84acd13a00fbecea11"
 uuid = "49802e3a-d2f1-5c88-81d8-b72133a6f568"
 version = "1.4.0"
 [[PyCall]]
 deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"]
 git-tree-sha1 = "169bb8ea6b1b143c5cf57df6d34d022a7b60c6db"
 uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
 version = "1.92.3"
 [[QuadGK]]
 deps = ["DataStructures", "LinearAlgebra"]
 git-tree-sha1 = "78aadffb3efd2155af139781b8a8df1ef279ea39"
 uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc"
 version = "2.4.2"
 [[REPL]]
 deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"]
 uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
 [[Random]]
 deps = ["Serialization"]
 uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 [[Reexport]]
 git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b"
 uuid = "189a3867-3050-52da-a836-e630ba90ab69"
 version = "1.2.2"
 [[Requires]]
 deps = ["UUIDs"]
 git-tree-sha1 = "4036a3bd08ac7e968e27c203d45f5fff15020621"
 uuid = "ae029012-a4dd-5104-9daa-d747884805df"
 version = "1.1.3"
 [[Rmath]]
 deps = ["Random", "Rmath_jll"]
 git-tree-sha1 = "bf3188feca147ce108c76ad82c2792c57abe7b1f"
 uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa"
 version = "0.7.0"
 [[Rmath_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "68db32dff12bb6127bac73c209881191bf0efbb7"
 uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f"
 version = "0.3.0+0"
 [[SHA]]
 uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
 [[SentinelArrays]]
 deps = ["Dates", "Random"]
 git-tree-sha1 = "54f37736d8934a12a200edea2f9206b03bdf3159"
 uuid = "91c51154-3ec4-41a3-a24f-3f23e20d615c"
 version = "1.3.7"
 [[Serialization]]
 uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
 [[SharedArrays]]
 deps = ["Distributed", "Mmap", "Random", "Serialization"]
 uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383"
 [[Sockets]]
 uuid = "6462fe0b-24de-5631-8697-dd941f90decc"
 [[SortingAlgorithms]]
 deps = ["DataStructures"]
 git-tree-sha1 = "b3363d7460f7d098ca0912c69b082f75625d7508"
 uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c"
 version = "1.0.1"
 [[SparseArrays]]
 deps = ["LinearAlgebra", "Random"]
 uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
 [[SpecialFunctions]]
 deps = ["ChainRulesCore", "LogExpFunctions", "OpenSpecFun_jll"]
 git-tree-sha1 = "a322a9493e49c5f3a10b50df3aedaf1cdb3244b7"
 uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
 version = "1.6.1"
 [[StaticArrays]]
 deps = ["LinearAlgebra", "Random", "Statistics"]
 git-tree-sha1 = "3240808c6d463ac46f1c1cd7638375cd22abbccb"
 uuid = "90137ffa-7385-5640-81b9-e52037218182"
 version = "1.2.12"
 [[Statistics]]
 deps = ["LinearAlgebra", "SparseArrays"]
 uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 [[StatsAPI]]
 git-tree-sha1 = "1958272568dc176a1d881acb797beb909c785510"
 uuid = "82ae8749-77ed-4fe6-ae5f-f523153014b0"
 version = "1.0.0"
 [[StatsBase]]
 deps = ["DataAPI", "DataStructures", "LinearAlgebra", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics", "StatsAPI"]
 git-tree-sha1 = "8cbbc098554648c84f79a463c9ff0fd277144b6c"
 uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
 version = "0.33.10"
 [[StatsFuns]]
 deps = ["ChainRulesCore", "IrrationalConstants", "LogExpFunctions", "Reexport", "Rmath", "SpecialFunctions"]
 git-tree-sha1 = "46d7ccc7104860c38b11966dd1f72ff042f382e4"
 uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
 version = "0.9.10"
 [[SuiteSparse]]
 deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"]
 uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9"
 [[TOML]]
 deps = ["Dates"]
 uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
 [[TableTraits]]
 deps = ["IteratorInterfaceExtensions"]
 git-tree-sha1 = "c06b2f539df1c6efa794486abfb6ed2022561a39"
 uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c"
 version = "1.0.1"
 [[Tables]]
 deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "TableTraits", "Test"]
 git-tree-sha1 = "1162ce4a6c4b7e31e0e6b14486a6986951c73be9"
 uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
 version = "1.5.2"
 [[Tar]]
 deps = ["ArgTools", "SHA"]
 uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"
 [[Test]]
 deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
 uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 [[TimerOutputs]]
 deps = ["ExprTools", "Printf"]
 git-tree-sha1 = "209a8326c4f955e2442c07b56029e88bb48299c7"
 uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
 version = "0.5.12"
 [[TranscodingStreams]]
 deps = ["Random", "Test"]
 git-tree-sha1 = "216b95ea110b5972db65aa90f88d8d89dcb8851c"
 uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa"
 version = "0.9.6"
 [[URIs]]
 git-tree-sha1 = "97bbe755a53fe859669cd907f2d96aee8d2c1355"
 uuid = "5c2747f8-b7ea-4ff2-ba2e-563bfd36b1d4"
 version = "1.3.0"
 [[UUIDs]]
 deps = ["Random", "SHA"]
 uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
 [[Unicode]]
 uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
 [[UnitCommitment]]
 deps = ["DataStructures", "Distributed", "Distributions", "GZip", "JSON", "JuMP", "LinearAlgebra", "Logging", "MathOptInterface", "PackageCompiler", "Printf", "Random", "SparseArrays"]
 path = "/home/axavier/Packages/UnitCommitment/dev/"
 uuid = "64606440-39ea-11e9-0f29-3303a1d3d877"
 version = "0.2.2"
 [[VersionParsing]]
 git-tree-sha1 = "80229be1f670524750d905f8fc8148e5a8c4537f"
 uuid = "81def892-9a0e-5fdd-b105-ffc91e053289"
 version = "1.2.0"
 [[WeakRefStrings]]
 deps = ["DataAPI", "Parsers"]
 git-tree-sha1 = "4a4cfb1ae5f26202db4f0320ac9344b3372136b0"
 uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5"
 version = "1.3.0"
 [[Zlib_jll]]
 deps = ["Libdl"]
 uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
 [[nghttp2_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
 [[p7zip_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0"
--- a/docs/benchmarks/Project.toml
+++ b/docs/benchmarks/Project.toml
@@ -1,5 +0,0 @@
 [deps]
 Gurobi = "2e9cd046-0924-5485-92f1-d5272153d98b"
 JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
 MIPLearn = "2b1277c3-b477-4c49-a15e-7ba350325c68"
 UnitCommitment = "64606440-39ea-11e9-0f29-3303a1d3d877"
--- a/docs/benchmarks/Untitled.ipynb
+++ b/docs/benchmarks/Untitled.ipynb
@@ -1,68 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "1ab068f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "df = pd.read_csv(\"/tmp/jl_depmrX\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "e64d7608",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    True\n",
       "1    True\n",
       "2    True\n",
       "Name: mip_sense, dtype: bool"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[\"mip_sense\"] == \"min\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b204b538",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/benchmarks/facility.ipynb
+++ b/docs/benchmarks/facility.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "792bbfa2",
   "metadata": {},
   "source": [
    "# Facility Location\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/benchmarks/gurobi.env
+++ b/docs/benchmarks/gurobi.env
@@ -1,3 +0,0 @@
 OutputFlag 1
 Threads 1
 TimeLimit 3600
--- a/docs/benchmarks/knapsack.ipynb
+++ b/docs/benchmarks/knapsack.ipynb
--- a/docs/benchmarks/preliminaries.ipynb
+++ b/docs/benchmarks/preliminaries.ipynb
@@ -1,51 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cf77634b",
   "metadata": {},
   "source": [
    "# Preliminaries\n",
    "\n",
    "## Benchmark challenges\n",
    "\n",
    "When evaluating the performance of a conventional MIP solver, *benchmark sets*, such as MIPLIB and TSPLIB, are typically used. The performance of newly proposed solvers or solution techniques are typically measured as the average (or total) running time the solver takes to solve the entire benchmark set. For Learning-Enhanced MIP solvers, it is also necessary to specify what instances should the solver be trained on (the *training instances*) before solving the actual set of instances we are interested in (the *test instances*). If the training instances are very similar to the test instances, we would expect a Learning-Enhanced Solver to present stronger perfomance benefits.\n",
    "\n",
    "In MIPLearn, each optimization problem comes with a set of **benchmark challenges**, which specify how should the training and test instances be generated. The first challenges are typically easier, in the sense that training and test instances are very similar. Later challenges gradually make the sets more distinct, and therefore harder to learn from.\n",
    "\n",
    "## Baseline results\n",
    "\n",
    "To illustrate the performance of `LearningSolver`, and to set a baseline for newly proposed techniques, we present in this page, for each benchmark challenge, a small set of computational results measuring the solution speed of the solver and the solution quality with default parameters. For more detailed computational studies, see [references](index.md#references). We compare three solvers:\n",
    "\n",
    "* **baseline:** Gurobi 9.1 with default settings (a conventional state-of-the-art MIP solver)\n",
    "* **ml-exact:** `LearningSolver` with default settings, using Gurobi as internal MIP solver\n",
    "* **ml-heuristic:** Same as above, but with `mode=\"heuristic\"`\n",
    "\n",
    "All experiments presented here were performed on a Linux workstation (Ubuntu Linux 20.04 LTS) with AMD Ryzen 3950X (16 cores, 32 threads) and 64 GB RAM (DDR4, 3200 MHz). All solvers were restricted to use a single thread, 3600 second time limit, and 16 instances were solved simultaneously at a time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f4597c3",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/benchmarks/stab.ipynb
+++ b/docs/benchmarks/stab.ipynb
--- a/docs/benchmarks/tsp.ipynb
+++ b/docs/benchmarks/tsp.ipynb
--- a/docs/benchmarks/uc.ipynb
+++ b/docs/benchmarks/uc.ipynb
@@ -1,151 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b1b21dfb",
   "metadata": {},
   "source": [
    "# Unit Commitment\n",
    "\n",
    "TODO"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "5c0dec00",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[32m\u001b[1m  Activating\u001b[22m\u001b[39m environment at `~/Packages/MIPLearn/dev/docs/benchmarks/Project.toml`\n"
     ]
    }
   ],
   "source": [
    "using Distributed\n",
    "# addprocs(17 - nprocs())\n",
    "@everywhere using Pkg\n",
    "@everywhere Pkg.activate(\".\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "5da58a62",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "┌ Info: Precompiling MIPLearn [2b1277c3-b477-4c49-a15e-7ba350325c68]\n",
      "└ @ Base loading.jl:1317\n"
     ]
    }
   ],
   "source": [
    "@everywhere using MIPLearn\n",
    "@everywhere using MIPLearn.BB\n",
    "@everywhere using UnitCommitment\n",
    "@everywhere using Gurobi\n",
    "@everywhere using Random\n",
    "@everywhere using JuMP\n",
    "@everywhere using Logging\n",
    "@everywhere import UnitCommitment: XavQiuAhm2021\n",
    "@everywhere Logging.disable_logging(Logging.Warn);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "444dbbb1",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "0.0%┣                                               ┫ 0/45 [00:00<00:-2, -0s/it]\n",
      "2.2%┣█                                          ┫ 1/45 [00:42<Inf:Inf, InfGs/it]\n",
      "WARNING: Force throwing a SIGINT\n"
     ]
    }
   ],
   "source": [
    "@everywhere Base.@kwdef struct UnitCommitmentData\n",
    "    case::String\n",
    "    hours::Int\n",
    "    seed::Int\n",
    "end\n",
    "\n",
    "@everywhere function build_uc_model(data::UnitCommitmentData)\n",
    "    instance = UnitCommitment.slice(\n",
    "        UnitCommitment.read_benchmark(\"matpower/$(data.case)/2017-02-01\"),\n",
    "        1:data.hours,\n",
    "    )\n",
    "    Random.seed!(data.seed)\n",
    "    UnitCommitment.randomize!(\n",
    "        instance,\n",
    "        XavQiuAhm2021.Randomization(),\n",
    "    )\n",
    "    model = UnitCommitment.build_model(\n",
    "        instance = instance,\n",
    "        variable_names = true,\n",
    "    )\n",
    "    @lazycb(\n",
    "        model,\n",
    "        UnitCommitment.find_lazy,\n",
    "        UnitCommitment.enforce_lazy,\n",
    "    )\n",
    "    return model\n",
    "end\n",
    "\n",
    "instances = MIPLearn.save(\n",
    "    [\n",
    "        UnitCommitmentData(\n",
    "            case = \"case1888rte\",\n",
    "            hours = 4,\n",
    "            seed = i,\n",
    "        )\n",
    "        for i in 1:50\n",
    "    ],\n",
    "    \"uc\",\n",
    ")\n",
    "\n",
    "MIPLearn.run_benchmarks(\n",
    "    optimizer = Gurobi.Optimizer,\n",
    "    train_instances = instances[1:45],\n",
    "    test_instances = instances[46:50],\n",
    "    build_model = build_uc_model,\n",
    "    progress = true,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2caac26f",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -1,8 +1,14 @@
 project = "MIPLearn"
-copyright = "2020-2021, UChicago Argonne, LLC"
+copyright = "2020-2023, UChicago Argonne, LLC"
 author = ""
-release = "0.2"
+release = "0.3"
-extensions = ["myst_parser", "nbsphinx"]
+extensions = [
    "myst_parser",
    "nbsphinx",
    "sphinx_multitoc_numbering",
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
 ]
 templates_path = ["_templates"]
 exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
 html_theme = "sphinx_book_theme"
--- a/docs/guide/collectors.ipynb
+++ b/docs/guide/collectors.ipynb
@@ -0,0 +1,277 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "505cea0b-5f5d-478a-9107-42bb5515937d",
   "metadata": {},
   "source": [
    "# Training Data Collectors\n",
    "The first step in solving mixed-integer optimization problems with the assistance of supervised machine learning methods is solving a large set of training instances and collecting the raw training data. In this section, we describe the various training data collectors included in MIPLearn. Additionally, the framework follows the convention of storing all training data in files with a specific data format (namely, HDF5). In this section, we briefly describe this format and the rationale for choosing it.\n",
    "\n",
    "## Overview\n",
    "\n",
    "In MIPLearn, a **collector** is a class that solves or analyzes the problem and collects raw data which may be later useful for machine learning methods. Collectors, by convention, take as input: (i) a list of problem data filenames, in gzipped pickle format, ending with `.pkl.gz`; (ii) a function that builds the optimization model, such as `build_tsp_model`. After processing is done, collectors store the training data in a HDF5 file located alongside with the problem data. For example, if the problem data is stored in file `problem.pkl.gz`, then the collector writes to `problem.h5`. Collectors are, in general, very time consuming, as they may need to solve the problem to optimality, potentially multiple times.\n",
    "\n",
    "## HDF5 Format\n",
    "\n",
    "MIPLearn stores all training data in [HDF5](HDF5) (Hierarchical Data Format, Version 5) files. The HDF format was originally developed by the [National Center for Supercomputing Applications][NCSA] (NCSA) for storing and organizing large amounts of data, and supports a variety of data types, including integers, floating-point numbers, strings, and arrays. Compared to other formats, such as CSV, JSON or SQLite, the HDF5 format provides several advantages for MIPLearn, including:\n",
    "\n",
    "- *Storage of multiple scalars, vectors and matrices in a single file* --- This allows MIPLearn to store all training data related to a given problem instance in a single file, which makes training data easier to store, organize and transfer.\n",
    "- *High-performance partial I/O* --- Partial I/O allows MIPLearn to read a single element from the training data (e.g. value of the optimal solution) without loading the entire file to memory or reading it from beginning to end, which dramatically improves performance and reduces memory requirements. This is especially important when processing a large number of training data files.\n",
    "- *On-the-fly compression* --- HDF5 files can be transparently compressed, using the gzip method, which reduces storage requirements and accelerates network transfers.\n",
    "- *Stable, portable and well-supported data format* --- Training data files are typically expensive to generate. Having a stable and well supported data format ensures that these files remain usable in the future, potentially even by other non-Python MIP/ML frameworks.\n",
    "\n",
    "MIPLearn currently uses HDF5 as simple key-value storage for numerical data; more advanced features of the format, such as metadata, are not currently used. Although files generated by MIPLearn can be read with any HDF5 library, such as [h5py][h5py], some convenience functions are provided to make the access more simple and less error-prone. Specifically, the class [H5File][H5File], which is built on top of h5py, provides the methods [put_scalar][put_scalar], [put_array][put_array], [put_sparse][put_sparse], [put_bytes][put_bytes] to store, respectively, scalar values, dense multi-dimensional arrays, sparse multi-dimensional arrays and arbitrary binary data. The corresponding *get* methods are also provided. Compared to pure h5py methods, these methods automatically perform type-checking and gzip compression. The example below shows their usage.\n",
    "\n",
    "[HDF5]: https://en.wikipedia.org/wiki/Hierarchical_Data_Format\n",
    "[NCSA]: https://en.wikipedia.org/wiki/National_Center_for_Supercomputing_Applications\n",
    "[h5py]: https://www.h5py.org/\n",
    "[H5File]: ../../api/helpers/#miplearn.h5.H5File\n",
    "[put_scalar]: ../../api/helpers/#miplearn.h5.H5File.put_scalar\n",
    "[put_array]: ../../api/helpers/#miplearn.h5.H5File.put_scalar\n",
    "[put_sparse]: ../../api/helpers/#miplearn.h5.H5File.put_scalar\n",
    "[put_bytes]: ../../api/helpers/#miplearn.h5.H5File.put_scalar\n",
    "\n",
    "\n",
    "### Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "f906fe9c",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x1 = 1\n",
      "x2 = hello world\n",
      "x3 = [1 2 3]\n",
      "x4 = [[0.37454012 0.9507143  0.7319939 ]\n",
      " [0.5986585  0.15601864 0.15599452]\n",
      " [0.05808361 0.8661761  0.601115  ]]\n",
      "x5 =   (2, 3)\t0.68030757\n",
      "  (3, 2)\t0.45049927\n",
      "  (4, 0)\t0.013264962\n",
      "  (0, 2)\t0.94220173\n",
      "  (4, 2)\t0.5632882\n",
      "  (2, 1)\t0.3854165\n",
      "  (1, 1)\t0.015966251\n",
      "  (3, 0)\t0.23089382\n",
      "  (4, 4)\t0.24102546\n",
      "  (1, 3)\t0.68326354\n",
      "  (3, 1)\t0.6099967\n",
      "  (0, 3)\t0.8331949\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import scipy.sparse\n",
    "\n",
    "from miplearn.h5 import H5File\n",
    "\n",
    "# Set random seed to make example reproducible\n",
    "np.random.seed(42)\n",
    "\n",
    "# Create a new empty HDF5 file\n",
    "with H5File(\"test.h5\", \"w\") as h5:\n",
    "    # Store a scalar\n",
    "    h5.put_scalar(\"x1\", 1)\n",
    "    h5.put_scalar(\"x2\", \"hello world\")\n",
    "\n",
    "    # Store a dense array and a dense matrix\n",
    "    h5.put_array(\"x3\", np.array([1, 2, 3]))\n",
    "    h5.put_array(\"x4\", np.random.rand(3, 3))\n",
    "\n",
    "    # Store a sparse matrix\n",
    "    h5.put_sparse(\"x5\", scipy.sparse.random(5, 5, 0.5))\n",
    "\n",
    "# Re-open the file we just created and print\n",
    "# previously-stored data\n",
    "with H5File(\"test.h5\", \"r\") as h5:\n",
    "    print(\"x1 =\", h5.get_scalar(\"x1\"))\n",
    "    print(\"x2 =\", h5.get_scalar(\"x2\"))\n",
    "    print(\"x3 =\", h5.get_array(\"x3\"))\n",
    "    print(\"x4 =\", h5.get_array(\"x4\"))\n",
    "    print(\"x5 =\", h5.get_sparse(\"x5\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "50441907",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "id": "d0000c8d",
   "metadata": {},
   "source": [
    "## Basic collector\n",
    "\n",
    "[BasicCollector][BasicCollector] is the most fundamental collector, and performs the following steps:\n",
    "\n",
    "1. Extracts all model data, such as objective function and constraint right-hand sides into numpy arrays, which can later be easily and efficiently accessed without rebuilding the model or invoking the solver;\n",
    "2. Solves the linear relaxation of the problem and stores its optimal solution, basis status and sensitivity information, among other information;\n",
    "3. Solves the original mixed-integer optimization problem to optimality and stores its optimal solution, along with solve statistics, such as number of explored nodes and wallclock time.\n",
    "\n",
    "Data extracted in Phases 1, 2 and 3 above are prefixed, respectively as `static_`, `lp_` and `mip_`. The entire set of fields is shown in the table below.\n",
    "\n",
    "[BasicCollector]: ../../api/collectors/#miplearn.collectors.basic.BasicCollector\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6529f667",
   "metadata": {},
   "source": [
    "### Data fields\n",
    "\n",
    "| Field                             | Type                | Description                                                                                                                                 |\n",
    "|-----------------------------------|---------------------|---------------------------------------------------------------------------------------------------------------------------------------------|\n",
    "| `static_constr_lhs`               | `(nconstrs, nvars)` | Constraint left-hand sides, in sparse matrix format                                                                                         |\n",
    "| `static_constr_names`             | `(nconstrs,)`       | Constraint names                                                                                                                            |\n",
    "| `static_constr_rhs`               | `(nconstrs,)`       | Constraint right-hand sides                                                                                                                 |\n",
    "| `static_constr_sense`             | `(nconstrs,)`       | Constraint senses (`\"<\"`, `\">\"` or `\"=\"`)                                                                                                   |\n",
    "| `static_obj_offset`               | `float`             | Constant value added to the objective function                                                                                              |\n",
    "| `static_sense`                    | `str`               | `\"min\"` if minimization problem or `\"max\"` otherwise                                                                                        |\n",
    "| `static_var_lower_bounds`         | `(nvars,)`          | Variable lower bounds                                                                                                                       |\n",
    "| `static_var_names`                | `(nvars,)`          | Variable names                                                                                                                              |\n",
    "| `static_var_obj_coeffs`           | `(nvars,)`          | Objective coefficients                                                                                                                      |\n",
    "| `static_var_types`                | `(nvars,)`          | Types of the decision variables (`\"C\"`, `\"B\"` and `\"I\"` for continuous, binary and integer, respectively)                                   |\n",
    "| `static_var_upper_bounds`         | `(nvars,)`          | Variable upper bounds                                                                                                                       |\n",
    "| `lp_constr_basis_status`          | `(nconstr,)`        | Constraint basis status (`0` for basic, `-1` for non-basic)                                                                                 |\n",
    "| `lp_constr_dual_values`           | `(nconstr,)`        | Constraint dual value (or shadow price)                                                                                                     |\n",
    "| `lp_constr_sa_rhs_{up,down}`      | `(nconstr,)`        | Sensitivity information for the constraint RHS                                                                                              |\n",
    "| `lp_constr_slacks`                | `(nconstr,)`        | Constraint slack in the solution to the LP relaxation                                                                                       |\n",
    "| `lp_obj_value`                    | `float`             | Optimal value of the LP relaxation                                                                                                          |\n",
    "| `lp_var_basis_status`             | `(nvars,)`          | Variable basis status (`0`, `-1`, `-2` or `-3` for basic, non-basic at lower bound, non-basic at upper bound, and superbasic, respectively) |\n",
    "| `lp_var_reduced_costs`            | `(nvars,)`          | Variable reduced costs                                                                                                                      |\n",
    "| `lp_var_sa_{obj,ub,lb}_{up,down}` | `(nvars,)`          | Sensitivity information for the variable objective coefficient, lower and upper bound.                                                      |\n",
    "| `lp_var_values`                   | `(nvars,)`          | Optimal solution to the LP relaxation                                                                                                       |\n",
    "| `lp_wallclock_time`               | `float`             | Time taken to solve the LP relaxation (in seconds)                                                                                          |\n",
    "| `mip_constr_slacks`               | `(nconstrs,)`       | Constraint slacks in the best MIP solution                                                                                                  |\n",
    "| `mip_gap`                         | `float`             | Relative MIP optimality gap                                                                                                                 |\n",
    "| `mip_node_count`                  | `float`             | Number of explored branch-and-bound nodes                                                                                                   |\n",
    "| `mip_obj_bound`                   | `float`             | Dual bound                                                                                                                                  |\n",
    "| `mip_obj_value`                   | `float`             | Value of the best MIP solution                                                                                                              |\n",
    "| `mip_var_values`                  | `(nvars,)`          | Best MIP solution                                                                                                                           |\n",
    "| `mip_wallclock_time`              | `float`             | Time taken to solve the MIP (in seconds)                                                                                                    |"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2894594",
   "metadata": {},
   "source": [
    "### Example\n",
    "\n",
    "The example below shows how to generate a few random instances of the traveling salesman problem, store its problem data, run the collector and print some of the training data to screen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "ac6f8c6f",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "lp_obj_value =  2909.0\n",
      "mip_obj_value =  2921.0\n"
     ]
    }
   ],
   "source": [
    "import random\n",
    "import numpy as np\n",
    "from scipy.stats import uniform, randint\n",
    "from glob import glob\n",
    "\n",
    "from miplearn.problems.tsp import (\n",
    "    TravelingSalesmanGenerator,\n",
    "    build_tsp_model,\n",
    ")\n",
    "from miplearn.io import write_pkl_gz\n",
    "from miplearn.h5 import H5File\n",
    "from miplearn.collectors.basic import BasicCollector\n",
    "\n",
    "# Set random seed to make example reproducible.\n",
    "random.seed(42)\n",
    "np.random.seed(42)\n",
    "\n",
    "# Generate a few instances of the traveling salesman problem.\n",
    "data = TravelingSalesmanGenerator(\n",
    "    n=randint(low=10, high=11),\n",
    "    x=uniform(loc=0.0, scale=1000.0),\n",
    "    y=uniform(loc=0.0, scale=1000.0),\n",
    "    gamma=uniform(loc=0.90, scale=0.20),\n",
    "    fix_cities=True,\n",
    "    round=True,\n",
    ").generate(10)\n",
    "\n",
    "# Save instance data to data/tsp/00000.pkl.gz, data/tsp/00001.pkl.gz, ...\n",
    "write_pkl_gz(data, \"data/tsp\")\n",
    "\n",
    "# Solve all instances and collect basic solution information.\n",
    "# Process at most four instances in parallel.\n",
    "bc = BasicCollector()\n",
    "bc.collect(glob(\"data/tsp/*.pkl.gz\"), build_tsp_model, n_jobs=4)\n",
    "\n",
    "# Read and print some training data for the first instance.\n",
    "with H5File(\"data/tsp/00000.h5\", \"r\") as h5:\n",
    "    print(\"lp_obj_value = \", h5.get_scalar(\"lp_obj_value\"))\n",
    "    print(\"mip_obj_value = \", h5.get_scalar(\"mip_obj_value\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78f0b07a",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/guide/features.ipynb
+++ b/docs/guide/features.ipynb
@@ -0,0 +1,334 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cdc6ebe9-d1d4-4de1-9b5a-4fc8ef57b11b",
   "metadata": {},
   "source": [
    "# Feature Extractors\n",
    "\n",
    "In the previous page, we introduced *training data collectors*, which solve the optimization problem and collect raw training data, such as the optimal solution. In this page, we introduce **feature extractors**, which take the raw training data, stored in HDF5 files, and extract relevant information in order to train a machine learning model."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4026de5",
   "metadata": {},
   "source": [
    "\n",
    "## Overview\n",
    "\n",
    "Feature extraction is an important step of the process of building a machine learning model because it helps to reduce the complexity of the data and convert it into a format that is more easily processed. Previous research has proposed converting absolute variable coefficients, for example, into relative values which are invariant  to various transformations, such as problem scaling, making them more amenable to learning. Various other transformations have also been described.\n",
    "\n",
    "In the framework, we treat data collection and feature extraction as two separate steps to accelerate the model development cycle. Specifically, collectors are typically time-consuming, as they often need to solve the problem to optimality, and therefore focus on collecting and storing all data that may or may not be relevant, in its raw format. Feature extractors, on the other hand, focus entirely on filtering the data and improving its representation, and are therefore much faster to run. Experimenting with new data representations, therefore, can be done without resolving the instances.\n",
    "\n",
    "In MIPLearn, extractors implement the abstract class [FeatureExtractor][FeatureExtractor], which has methods that take as input an [H5File][H5File] and produce either: (i) instance features, which describe the entire instances; (ii) variable features, which describe a particular decision variables; or (iii) constraint features, which describe a particular constraint. The extractor is free to implement only a subset of these methods, if it is known that it will not be used with a machine learning component that requires the other types of features.\n",
    "\n",
    "[FeatureExtractor]: ../../api/collectors/#miplearn.features.fields.FeaturesExtractor\n",
    "[H5File]: ../../api/helpers/#miplearn.h5.H5File"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2d9736c",
   "metadata": {},
   "source": [
    "\n",
    "## H5FieldsExtractor\n",
    "\n",
    "[H5FieldsExtractor][H5FieldsExtractor], the most simple extractor in MIPLearn, simple extracts data that is already available in the HDF5 file, assembles it into a matrix and returns it as-is. The fields used to build instance, variable and constraint features are user-specified. The class also performs checks to ensure that the shapes of the returned matrices make sense."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8184dff",
   "metadata": {},
   "source": [
    "### Example\n",
    "\n",
    "The example below demonstrates the usage of H5FieldsExtractor in a randomly generated instance of the multi-dimensional knapsack problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "ed9a18c8",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "instance features (11,) \n",
      " [-1531.24308771  -350.          -692.          -454.\n",
      "  -709.          -605.          -543.          -321.\n",
      "  -674.          -571.          -341.        ]\n",
      "variable features (10, 4) \n",
      " [[-1.53124309e+03 -3.50000000e+02  0.00000000e+00  9.43468018e+01]\n",
      " [-1.53124309e+03 -6.92000000e+02  2.51703322e-01  0.00000000e+00]\n",
      " [-1.53124309e+03 -4.54000000e+02  0.00000000e+00  8.25504150e+01]\n",
      " [-1.53124309e+03 -7.09000000e+02  1.11373022e-01  0.00000000e+00]\n",
      " [-1.53124309e+03 -6.05000000e+02  1.00000000e+00 -1.26055283e+02]\n",
      " [-1.53124309e+03 -5.43000000e+02  0.00000000e+00  1.68693771e+02]\n",
      " [-1.53124309e+03 -3.21000000e+02  1.07488781e-01  0.00000000e+00]\n",
      " [-1.53124309e+03 -6.74000000e+02  8.82293701e-01  0.00000000e+00]\n",
      " [-1.53124309e+03 -5.71000000e+02  0.00000000e+00  1.41129074e+02]\n",
      " [-1.53124309e+03 -3.41000000e+02  1.28830120e-01  0.00000000e+00]]\n",
      "constraint features (5, 3) \n",
      " [[ 1.3100000e+03 -1.5978307e-01  0.0000000e+00]\n",
      " [ 9.8800000e+02 -3.2881632e-01  0.0000000e+00]\n",
      " [ 1.0040000e+03 -4.0601316e-01  0.0000000e+00]\n",
      " [ 1.2690000e+03 -1.3659772e-01  0.0000000e+00]\n",
      " [ 1.0070000e+03 -2.8800571e-01  0.0000000e+00]]\n"
     ]
    }
   ],
   "source": [
    "from glob import glob\n",
    "from shutil import rmtree\n",
    "\n",
    "import numpy as np\n",
    "from scipy.stats import uniform, randint\n",
    "\n",
    "from miplearn.collectors.basic import BasicCollector\n",
    "from miplearn.extractors.fields import H5FieldsExtractor\n",
    "from miplearn.h5 import H5File\n",
    "from miplearn.io import write_pkl_gz\n",
    "from miplearn.problems.multiknapsack import (\n",
    "    MultiKnapsackGenerator,\n",
    "    build_multiknapsack_model,\n",
    ")\n",
    "\n",
    "# Set random seed to make example reproducible\n",
    "np.random.seed(42)\n",
    "\n",
    "# Generate some random multiknapsack instances\n",
    "rmtree(\"data/multiknapsack/\", ignore_errors=True)\n",
    "write_pkl_gz(\n",
    "    MultiKnapsackGenerator(\n",
    "        n=randint(low=10, high=11),\n",
    "        m=randint(low=5, high=6),\n",
    "        w=uniform(loc=0, scale=1000),\n",
    "        K=uniform(loc=100, scale=0),\n",
    "        u=uniform(loc=1, scale=0),\n",
    "        alpha=uniform(loc=0.25, scale=0),\n",
    "        w_jitter=uniform(loc=0.95, scale=0.1),\n",
    "        p_jitter=uniform(loc=0.75, scale=0.5),\n",
    "        fix_w=True,\n",
    "    ).generate(10),\n",
    "    \"data/multiknapsack\",\n",
    ")\n",
    "\n",
    "# Run the basic collector\n",
    "BasicCollector().collect(\n",
    "    glob(\"data/multiknapsack/*\"),\n",
    "    build_multiknapsack_model,\n",
    "    n_jobs=4,\n",
    ")\n",
    "\n",
    "ext = H5FieldsExtractor(\n",
    "    # Use as instance features the value of the LP relaxation and the\n",
    "    # vector of objective coefficients.\n",
    "    instance_fields=[\n",
    "        \"lp_obj_value\",\n",
    "        \"static_var_obj_coeffs\",\n",
    "    ],\n",
    "    # For each variable, use as features the optimal value of the LP\n",
    "    # relaxation, the variable objective coefficient, the variable's\n",
    "    # value its reduced cost.\n",
    "    var_fields=[\n",
    "        \"lp_obj_value\",\n",
    "        \"static_var_obj_coeffs\",\n",
    "        \"lp_var_values\",\n",
    "        \"lp_var_reduced_costs\",\n",
    "    ],\n",
    "    # For each constraint, use as features the RHS, dual value and slack.\n",
    "    constr_fields=[\n",
    "        \"static_constr_rhs\",\n",
    "        \"lp_constr_dual_values\",\n",
    "        \"lp_constr_slacks\",\n",
    "    ],\n",
    ")\n",
    "\n",
    "with H5File(\"data/multiknapsack/00000.h5\") as h5:\n",
    "    # Extract and print instance features\n",
    "    x1 = ext.get_instance_features(h5)\n",
    "    print(\"instance features\", x1.shape, \"\\n\", x1)\n",
    "\n",
    "    # Extract and print variable features\n",
    "    x2 = ext.get_var_features(h5)\n",
    "    print(\"variable features\", x2.shape, \"\\n\", x2)\n",
    "\n",
    "    # Extract and print constraint features\n",
    "    x3 = ext.get_constr_features(h5)\n",
    "    print(\"constraint features\", x3.shape, \"\\n\", x3)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2da2e74e",
   "metadata": {},
   "source": [
    "\n",
    "[H5FieldsExtractor]: ../../api/collectors/#miplearn.features.fields.H5FieldsExtractor"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d879c0d3",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-warning\">\n",
    "Warning\n",
    "\n",
    "You should ensure that the number of features remains the same for all relevant HDF5 files. In the previous example, to illustrate this issue, we used variable objective coefficients as instance features. While this is allowed, note that this requires all problem instances to have the same number of variables; otherwise the number of features would vary from instance to instance and MIPLearn would be unable to concatenate the matrices.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd0ba071",
   "metadata": {},
   "source": [
    "## AlvLouWeh2017Extractor\n",
    "\n",
    "Alvarez, Louveaux and Wehenkel (2017) proposed a set features to describe a particular decision variable in a given node of the branch-and-bound tree, and applied it to the problem of mimicking strong branching decisions. The class [AlvLouWeh2017Extractor][] implements a subset of these features (40 out of 64), which are available outside of the branch-and-bound tree. Some features are derived from the static defintion of the problem (i.e. from objective function and constraint data), while some features are derived from the solution to the LP relaxation. The features have been designed to be: (i) independent of the size of the problem; (ii) invariant with respect to irrelevant problem transformations, such as row and column permutation; and (iii) independent of the scale of the problem. We refer to the paper for a more complete description.\n",
    "\n",
    "### Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a1bc38fe",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x1 (10, 40) \n",
      " [[-1.00e+00  1.00e+20  1.00e-01  1.00e+00  0.00e+00  1.00e+00  6.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  6.00e-01  1.00e+00  1.75e+01  1.00e+00  2.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00 -1.00e+00  0.00e+00  1.00e+20]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  1.00e-01  1.00e+00  1.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  7.00e-01  1.00e+00  5.10e+00  1.00e+00  2.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   3.00e-01 -1.00e+00 -1.00e+00  0.00e+00  0.00e+00]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  0.00e+00  1.00e+00  9.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  5.00e-01  1.00e+00  1.30e+01  1.00e+00  2.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00 -1.00e+00  0.00e+00  1.00e+20]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  2.00e-01  1.00e+00  9.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  8.00e-01  1.00e+00  3.40e+00  1.00e+00  2.00e-01\n",
      "   1.00e+00  1.00e-01  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   1.00e-01 -1.00e+00 -1.00e+00  0.00e+00  0.00e+00]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  1.00e-01  1.00e+00  7.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  6.00e-01  1.00e+00  3.80e+00  1.00e+00  2.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00 -1.00e+00 -1.00e+00  0.00e+00  0.00e+00]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  1.00e-01  1.00e+00  8.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  7.00e-01  1.00e+00  3.30e+00  1.00e+00  2.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00 -1.00e+00  0.00e+00  1.00e+20]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  0.00e+00  1.00e+00  3.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  1.00e+00  1.00e+00  5.70e+00  1.00e+00  1.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   1.00e-01 -1.00e+00 -1.00e+00  0.00e+00  0.00e+00]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  1.00e-01  1.00e+00  6.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  8.00e-01  1.00e+00  6.80e+00  1.00e+00  2.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   1.00e-01 -1.00e+00 -1.00e+00  0.00e+00  0.00e+00]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  4.00e-01  1.00e+00  6.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  8.00e-01  1.00e+00  1.40e+00  1.00e+00  1.00e-01\n",
      "   1.00e+00  1.00e-01  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00 -1.00e+00  0.00e+00  1.00e+20]\n",
      " [-1.00e+00  1.00e+20  1.00e-01  1.00e+00  0.00e+00  1.00e+00  5.00e-01\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  1.00e+00  5.00e-01  1.00e+00  7.60e+00  1.00e+00  1.00e-01\n",
      "   1.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00  0.00e+00\n",
      "   1.00e-01 -1.00e+00 -1.00e+00  0.00e+00  0.00e+00]]\n"
     ]
    }
   ],
   "source": [
    "from miplearn.extractors.AlvLouWeh2017 import AlvLouWeh2017Extractor\n",
    "from miplearn.h5 import H5File\n",
    "\n",
    "# Build the extractor\n",
    "ext = AlvLouWeh2017Extractor()\n",
    "\n",
    "# Open previously-created multiknapsack training data\n",
    "with H5File(\"data/multiknapsack/00000.h5\") as h5:\n",
    "    # Extract and print variable features\n",
    "    x1 = ext.get_var_features(h5)\n",
    "    print(\"x1\", x1.shape, \"\\n\", x1.round(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "286c9927",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "References\n",
    "\n",
    "* **Alvarez, Alejandro Marcos.** *Computational and theoretical synergies between linear optimization and supervised machine learning.* (2016). University of Liège.\n",
    "* **Alvarez, Alejandro Marcos, Quentin Louveaux, and Louis Wehenkel.** *A machine learning-based approximation of strong branching.* INFORMS Journal on Computing 29.1 (2017): 185-195.\n",
    "\n",
    "</div>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/guide/primal.ipynb
+++ b/docs/guide/primal.ipynb
@@ -0,0 +1,291 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "880cf4c7-d3c4-4b92-85c7-04a32264cdae",
   "metadata": {},
   "source": [
    "# Primal Components\n",
    "\n",
    "In MIPLearn, a **primal component** is class that uses machine learning to predict a (potentially partial) assignment of values to the decision variables of the problem. Predicting high-quality primal solutions may be beneficial, as they allow the MIP solver to prune potentially large portions of the search space. Alternatively, if proof of optimality is not required, the MIP solver can be used to complete the partial solution generated by the machine learning model and and double-check its feasibility. MIPLearn allows both of these usage patterns.\n",
    "\n",
    "In this page, we describe the four primal components currently included in MIPLearn, which employ machine learning in different ways. Each component is highly configurable, and accepts an user-provided machine learning model, which it uses for all predictions. Each component can also be configured to provide the solution to the solver in multiple ways, depending on whether proof of optimality is required.\n",
    "\n",
    "## Primal component actions\n",
    "\n",
    "Before presenting the primal components themselves, we briefly discuss the three ways a solution may be provided to the solver. Each approach has benefits and limitations, which we also discuss in this section. All primal components can be configured to use any of the following approaches.\n",
    "\n",
    "The first approach is to provide the solution to the solver as a **warm start**. This is implemented by the class [SetWarmStart](SetWarmStart). The main advantage is that this method maintains all optimality and feasibility guarantees of the MIP solver, while still providing significant performance benefits for various classes of problems. If the machine learning model is able to predict multiple solutions, it is also possible to set multiple warm starts. In this case, the solver evaluates each warm start, discards the infeasible ones, then proceeds with the one that has the best objective value. The main disadvantage of this approach, compared to the next two, is that it provides relatively modest speedups for most problem classes, and no speedup at all for many others, even when the machine learning predictions are 100% accurate.\n",
    "\n",
    "[SetWarmStart]: ../../api/components/#miplearn.components.primal.actions.SetWarmStart\n",
    "\n",
    "The second approach is to **fix the decision variables** to their predicted values, then solve a restricted optimization problem on the remaining variables. This approach is implemented by the class `FixVariables`. The main advantage is its potential speedup: if machine learning can accurately predict values for a significant portion of the decision variables, then the MIP solver can typically complete the solution in a small fraction of the time it would take to find the same solution from scratch. The main disadvantage of this approach is that it loses optimality guarantees; that is, the complete solution found by the MIP solver may no longer be globally optimal. Also, if the machine learning predictions are not sufficiently accurate, there might not even be a feasible assignment for the variables that were left free.\n",
    "\n",
    "Finally, the third approach, which tries to strike a balance between the two previous ones, is to **enforce proximity** to a given solution. This strategy is implemented by the class `EnforceProximity`. More precisely, given values $\\bar{x}_1,\\ldots,\\bar{x}_n$ for a subset of binary decision variables $x_1,\\ldots,x_n$, this approach adds the constraint\n",
    "\n",
    "$$\n",
    "\\sum_{i : \\bar{x}_i=0} x_i + \\sum_{i : \\bar{x}_i=1} \\left(1 - x_i\\right) \\leq k,\n",
    "$$\n",
    "to the problem, where $k$ is a user-defined parameter, which indicates how many of the predicted variables are allowed to deviate from the machine learning suggestion. The main advantage of this approach, compared to fixing variables, is its tolerance to lower-quality machine learning predictions. Its main disadvantage is that it typically leads to smaller speedups, especially for larger values of $k$. This approach also loses optimality guarantees.\n",
    "\n",
    "## Memorizing primal component\n",
    "\n",
    "A simple machine learning strategy for the prediction of primal solutions is to memorize all distinct solutions seen during training, then try to predict, during inference time, which of those memorized solutions are most likely to be feasible and to provide a good objective value for the current instance. The most promising solutions may alternatively be combined into a single partial solution, which is then provided to the MIP solver. Both variations of this strategy are implemented by the `MemorizingPrimalComponent` class. Note that it is only applicable if the problem size, and in fact if the meaning of the decision variables, remains the same across problem instances.\n",
    "\n",
    "More precisely, let $I_1,\\ldots,I_n$ be the training instances, and let $\\bar{x}^1,\\ldots,\\bar{x}^n$ be their respective optimal solutions. Given a new instance $I_{n+1}$, `MemorizingPrimalComponent` expects a user-provided binary classifier that assigns (through the `predict_proba` method, following scikit-learn's conventions) a score $\\delta_i$ to each solution $\\bar{x}^i$, such that solutions with higher score are more likely to be good solutions for $I_{n+1}$. The features provided to the classifier are the instance features computed by an user-provided extractor. Given these scores, the component then performs one of the following to actions, as decided by the user:\n",
    "\n",
    "1. Selects the top $k$ solutions with the highest scores and provides them to the solver; this is implemented by `SelectTopSolutions`, and it is typically used with the `SetWarmStart` action.\n",
    "\n",
    "2. Merges the top $k$ solutions into a single partial solution, then provides it to the solver. This is implemented by `MergeTopSolutions`. More precisely, suppose that the machine learning regressor ordered the solutions in the sequence $\\bar{x}^{i_1},\\ldots,\\bar{x}^{i_n}$, with the most promising solutions appearing first, and with ties being broken arbitrarily. The component starts by keeping only the $k$ most promising solutions $\\bar{x}^{i_1},\\ldots,\\bar{x}^{i_k}$. Then it computes, for each binary decision variable $x_l$, its average assigned value $\\tilde{x}_l$:\n",
    "$$\n",
    "    \\tilde{x}_l = \\frac{1}{k} \\sum_{j=1}^k \\bar{x}^{i_j}_l.\n",
    "$$\n",
    "  Finally, the component constructs a merged solution $y$, defined as:\n",
    "$$\n",
    "    y_j = \\begin{cases}\n",
    "        0 & \\text{ if } \\tilde{x}_l \\le \\theta_0 \\\\\n",
    "        1 & \\text{ if } \\tilde{x}_l \\ge \\theta_1 \\\\\n",
    "        \\square & \\text{otherwise,}\n",
    "    \\end{cases}\n",
    "$$\n",
    "  where $\\theta_0$ and $\\theta_1$ are user-specified parameters, and where $\\square$ indicates that the variable is left undefined. The solution $y$ is then provided by the solver using any of the three approaches defined in the previous section.\n",
    "\n",
    "The above specification of `MemorizingPrimalComponent` is meant to be as general as possible. Simpler strategies can be implemented by configuring this component in specific ways. For example, a simpler approach employed in the literature is to collect all optimal solutions, then provide the entire list of solutions to the solver as warm starts, without any filtering or post-processing. This strategy can be implemented with `MemorizingPrimalComponent` by using a model that returns a constant value for all solutions (e.g. [scikit-learn's DummyClassifier][DummyClassifier]), then selecting the top $n$ (instead of $k$) solutions. See example below. Another simple approach is taking the solution to the most similar instance, and using it, by itself, as a warm start. This can be implemented by using a model that computes distances between the current instance and the training ones (e.g. [scikit-learn's KNeighborsClassifier][KNeighborsClassifier]), then select the solution to the nearest one. See also example below. More complex strategies, of course, can also be configured.\n",
    "\n",
    "[DummyClassifier]: https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyClassifier.html\n",
    "[KNeighborsClassifier]: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html\n",
    "\n",
    "### Examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "253adbf4",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.dummy import DummyClassifier\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "from miplearn.components.primal.actions import (\n",
    "    SetWarmStart,\n",
    "    FixVariables,\n",
    "    EnforceProximity,\n",
    ")\n",
    "from miplearn.components.primal.mem import (\n",
    "    MemorizingPrimalComponent,\n",
    "    SelectTopSolutions,\n",
    "    MergeTopSolutions,\n",
    ")\n",
    "from miplearn.extractors.dummy import DummyExtractor\n",
    "from miplearn.extractors.fields import H5FieldsExtractor\n",
    "\n",
    "# Configures a memorizing primal component that collects\n",
    "# all distinct solutions seen during training and provides\n",
    "# them to the solver without any filtering or post-processing.\n",
    "comp1 = MemorizingPrimalComponent(\n",
    "    clf=DummyClassifier(),\n",
    "    extractor=DummyExtractor(),\n",
    "    constructor=SelectTopSolutions(1_000_000),\n",
    "    action=SetWarmStart(),\n",
    ")\n",
    "\n",
    "# Configures a memorizing primal component that finds the\n",
    "# training instance with the closest objective function, then\n",
    "# fixes the decision variables to the values they assumed\n",
    "# at the optimal solution for that instance.\n",
    "comp2 = MemorizingPrimalComponent(\n",
    "    clf=KNeighborsClassifier(n_neighbors=1),\n",
    "    extractor=H5FieldsExtractor(\n",
    "        instance_fields=[\"static_var_obj_coeffs\"],\n",
    "    ),\n",
    "    constructor=SelectTopSolutions(1),\n",
    "    action=FixVariables(),\n",
    ")\n",
    "\n",
    "# Configures a memorizing primal component that finds the distinct\n",
    "# solutions to the 10 most similar training problem instances,\n",
    "# selects the 3 solutions that were most often optimal to these\n",
    "# training instances, combines them into a single partial solution,\n",
    "# then enforces proximity, allowing at most 3 variables to deviate\n",
    "# from the machine learning suggestion.\n",
    "comp3 = MemorizingPrimalComponent(\n",
    "    clf=KNeighborsClassifier(n_neighbors=10),\n",
    "    extractor=H5FieldsExtractor(instance_fields=[\"static_var_obj_coeffs\"]),\n",
    "    constructor=MergeTopSolutions(k=3, thresholds=[0.25, 0.75]),\n",
    "    action=EnforceProximity(3),\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f194a793",
   "metadata": {},
   "source": [
    "## Independent vars primal component\n",
    "\n",
    "Instead of memorizing previously-seen primal solutions, it is also natural to use machine learning models to directly predict the values of the decision variables, constructing a solution from scratch. This approach has the benefit of potentially constructing novel high-quality solutions, never observed in the training data. Two variations of this strategy are supported by MIPLearn: (i) predicting the values of the decision variables independently, using multiple ML models; or (ii) predicting the values jointly, with a single model. We describe the first variation in this section, and the second variation in the next section.\n",
    "\n",
    "Let $I_1,\\ldots,I_n$ be the training instances, and let $\\bar{x}^1,\\ldots,\\bar{x}^n$ be their respective optimal solutions. For each binary decision variable $x_j$, the component `IndependentVarsPrimalComponent` creates a copy of a user-provided binary classifier and trains it to predict the optimal value of $x_j$, given $\\bar{x}^1_j,\\ldots,\\bar{x}^n_j$ as training labels. The features provided to the model are the variable features computed by a user-provided extractor. At inference time, the component uses these binary classifiers (one per binary decision variable) to construct a solution and provides it to the solver using one of the available actions.\n",
    "\n",
    "Three issues often arise in practice when using this approach:\n",
    "\n",
    "1. For certain binary variables $x_j$, the optimal value is frequently either always zero or always one in the training dataset, which poses problems for some standard scikit-learn classifiers, since they do not expect a single class. The wrapper `SingleClassFix` can be used to fix this issue (see example below).\n",
    "2. It is also frequently the case that machine learning classifiers can reliably predict the values of only some variables, not all of them. In this situation, instead of computing a complete primal solution, it may be more beneficial to construct a partial solution containing values only for the variables for which the ML model made a high-confidence prediction. The meta-classifier `MinProbabilityClassifier` can be used for this purpose. It asks the base classifier for the probability of the value being zero or one (using the `predict_proba` method) and erases from the primal solution all values whose probabilities are below a given threshold.\n",
    "3. To make multiple copies of the provided ML classifier, MIPLearn uses the standard `sklearn.base.clone` method, which may not be suitable for classifiers from other frameworks. To handle this, it is possible to override the clone function using the `clone_fn` constructor argument.\n",
    "\n",
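    "As a sketch of the thresholding rule described in item 2 (this is one reading of the text above, not taken from the MIPLearn source code; the symbols $p_j^0$, $p_j^1$, $\\tau_0$ and $\\tau_1$ are introduced here only for illustration), let $p_j^0$ and $p_j^1$ be the probabilities that the base classifier assigns to $x_j = 0$ and $x_j = 1$, and let $\\tau_0, \\tau_1 > 0.5$ be the thresholds. The predicted value $\\hat{x}_j$ can then be written as:\n",
    "$$\n",
    "    \\hat{x}_j = \\begin{cases}\n",
    "        0 & \\text{ if } p_j^0 \\ge \\tau_0 \\\\\n",
    "        1 & \\text{ if } p_j^1 \\ge \\tau_1 \\\\\n",
    "        \\square & \\text{otherwise.}\n",
    "    \\end{cases}\n",
    "$$\n",
    "For instance, with $\\tau_0 = \\tau_1 = 0.99$, hypothetical probabilities $(p_j^0, p_j^1)$ of $(0.995, 0.005)$, $(0.30, 0.70)$ and $(0.002, 0.998)$ would yield $0$, $\\square$ and $1$, respectively.\n",
    "\n",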
    "### Examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "3fc0b5d1",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.linear_model import LogisticRegression\n",
    "from miplearn.classifiers.minprob import MinProbabilityClassifier\n",
    "from miplearn.classifiers.singleclass import SingleClassFix\n",
    "from miplearn.components.primal.indep import IndependentVarsPrimalComponent\n",
    "from miplearn.extractors.AlvLouWeh2017 import AlvLouWeh2017Extractor\n",
    "from miplearn.components.primal.actions import SetWarmStart\n",
    "\n",
    "# Configures a primal component that independently predicts the value of each\n",
    "# binary variable using logistic regression and provides it to the solver as\n",
    "# warm start. Erases predictions with probability less than 99%; applies\n",
    "# single-class fix; and uses AlvLouWeh2017 features.\n",
    "comp = IndependentVarsPrimalComponent(\n",
    "    base_clf=SingleClassFix(\n",
    "        MinProbabilityClassifier(\n",
    "            base_clf=LogisticRegression(),\n",
    "            thresholds=[0.99, 0.99],\n",
    "        ),\n",
    "    ),\n",
    "    extractor=AlvLouWeh2017Extractor(),\n",
    "    action=SetWarmStart(),\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45107a0c",
   "metadata": {},
   "source": [
    "## Joint vars primal component\n",
    "\n",
    "In the previous subsection, we used multiple machine learning models to independently predict the values of the binary decision variables. When these values are correlated, an alternative approach is to jointly predict the values of all binary variables using a single machine learning model. This strategy is implemented by `JointVarsPrimalComponent`. Compared to the previous ones, this component is much more straightforward. It simply extracts instance features, using the user-provided feature extractor, then directly trains the user-provided binary classifier (using the `fit` method), without making any copies. The trained classifier is then used to predict entire solutions (using the `predict` method), which are given to the solver using one of the previously discussed methods. In the example below, we illustrate the usage of this component with a simple feed-forward neural network.\n",
    "\n",
    "`JointVarsPrimalComponent` can also be used to implement strategies that use multiple machine learning models, but not independently. For example, a common strategy in multioutput prediction is building a *classifier chain*. In this approach, the first decision variable is predicted using the instance features alone; but the $n$-th decision variable is predicted using the instance features plus the predicted values of the $n-1$ previous variables. This can be easily implemented using scikit-learn's `ClassifierChain` estimator, as shown in the example below.\n",
    "\n",
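    "In symbols (with $\\phi$ denoting the instance features and $f_1, \\ldots, f_n$ the individual models; this notation is introduced here only for illustration), a classifier chain over $n$ decision variables computes, at prediction time:\n",
    "$$\n",
    "    \\hat{x}_1 = f_1(\\phi), \\qquad\n",
    "    \\hat{x}_j = f_j(\\phi, \\hat{x}_1, \\ldots, \\hat{x}_{j-1}) \\quad \\text{for } j = 2, \\ldots, n.\n",
    "$$\n",
    "\n",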
    "### Examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "cf9b52dd",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.multioutput import ClassifierChain\n",
    "from sklearn.neural_network import MLPClassifier\n",
    "from miplearn.classifiers.singleclass import SingleClassFix\n",
    "from miplearn.components.primal.joint import JointVarsPrimalComponent\n",
    "from miplearn.extractors.fields import H5FieldsExtractor\n",
    "from miplearn.components.primal.actions import SetWarmStart\n",
    "\n",
    "# Configures a primal component that uses a feedforward neural network\n",
    "# to jointly predict the values of the binary variables, based on the\n",
    "# objective cost function, and provides the solution to the solver as\n",
    "# a warm start.\n",
    "comp = JointVarsPrimalComponent(\n",
    "    clf=MLPClassifier(),\n",
    "    extractor=H5FieldsExtractor(\n",
    "        instance_fields=[\"static_var_obj_coeffs\"],\n",
    "    ),\n",
    "    action=SetWarmStart(),\n",
    ")\n",
    "\n",
    "# Configures a primal component that uses a chain of logistic regression\n",
    "# models to jointly predict the values of the binary variables, based on\n",
    "# the objective function.\n",
    "comp = JointVarsPrimalComponent(\n",
    "    clf=ClassifierChain(SingleClassFix(LogisticRegression())),\n",
    "    extractor=H5FieldsExtractor(\n",
    "        instance_fields=[\"static_var_obj_coeffs\"],\n",
    "    ),\n",
    "    action=SetWarmStart(),\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dddf7be4",
   "metadata": {},
   "source": [
    "## Expert primal component\n",
    "\n",
    "Before spending time and effort choosing a machine learning strategy and tweaking its parameters, it is usually a good idea to evaluate what the performance impact of the model would be if its predictions were 100% accurate. This is especially important for the prediction of warm starts, since they are not always very beneficial. To simplify this task, MIPLearn provides `ExpertPrimalComponent`, a component which simply loads the optimal solution from the HDF5 file, assuming that it has already been computed, then directly provides it to the solver using one of the available methods. This component is useful in benchmarks, to evaluate how close the machine learning components are to the best theoretical performance.\n",
    "\n",
    "### Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "9e2e81b9",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "from miplearn.components.primal.expert import ExpertPrimalComponent\n",
    "from miplearn.components.primal.actions import SetWarmStart\n",
    "\n",
    "# Configures an expert primal component, which reads a pre-computed\n",
    "# optimal solution from the HDF5 file and provides it to the solver\n",
    "# as warm start.\n",
    "comp = ExpertPrimalComponent(action=SetWarmStart())\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/guide/problems.ipynb
+++ b/docs/guide/problems.ipynb
--- a/docs/guide/solvers.ipynb
+++ b/docs/guide/solvers.ipynb
@@ -0,0 +1,247 @@
 {
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9ec1907b-db93-4840-9439-c9005902b968",
   "metadata": {},
   "source": [
    "# Learning Solver\n",
    "\n",
    "On previous pages, we discussed various components of the MIPLearn framework, including training data collectors, feature extractors, and individual machine learning components. On this page, we introduce **LearningSolver**, the main class of the framework, which integrates all the aforementioned components into a cohesive whole. Using **LearningSolver** involves three steps: (i) configuring the solver; (ii) training the ML components; and (iii) solving new MIP instances. In the following, we describe each of these steps, then conclude with a complete runnable example.\n",
    "\n",
    "### Configuring the solver\n",
    "\n",
    "**LearningSolver** is composed of multiple individual machine learning components, each targeting a different part of the solution process, or implementing a different machine learning strategy. This architecture allows strategies to be easily enabled, disabled or customized, making the framework flexible. By default, no components are provided and **LearningSolver** is equivalent to a traditional MIP solver. To specify additional components, the `components` constructor argument may be used:\n",
    "\n",
    "```python\n",
    "solver = LearningSolver(\n",
    "    components=[\n",
    "        comp1,\n",
    "        comp2,\n",
    "        comp3,\n",
    "    ]\n",
    ")\n",
    "```\n",
    "\n",
    "In this example, three components `comp1`, `comp2` and `comp3` are provided. The strategies implemented by these components are applied sequentially when solving the problem. For example, `comp1` and `comp2` could fix a subset of decision variables, while `comp3` constructs a warm start for the remaining problem.\n",
    "\n",
    "### Training and solving new instances\n",
    "\n",
    "Once a solver is configured, its ML components need to be trained. This can be done with the `solver.fit` method, as illustrated below. The method accepts a list of HDF5 files and trains each individual component sequentially. Once the solver is trained, new instances can be solved using `solver.optimize`, which returns a dictionary of statistics collected by each component, such as the number of variables fixed.\n",
    "\n",
    "```python\n",
    "# Build instances\n",
    "train_data = ...\n",
    "test_data = ...\n",
    "\n",
    "# Collect training data\n",
    "bc = BasicCollector()\n",
    "bc.collect(train_data, build_model)\n",
    "\n",
    "# Build solver\n",
    "solver = LearningSolver(...)\n",
    "\n",
    "# Train components\n",
    "solver.fit(train_data)\n",
    "\n",
    "# Solve a new test instance\n",
    "stats = solver.optimize(test_data[0], build_model)\n",
    "\n",
    "```\n",
    "\n",
    "### Complete example\n",
    "\n",
    "In the example below, we illustrate the usage of **LearningSolver** by building instances of the Traveling Salesman Problem, collecting training data, training the ML components, then solving a new instance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "92b09b98",
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 32 threads\n",
      "\n",
      "Optimize a model with 10 rows, 45 columns and 90 nonzeros\n",
      "Model fingerprint: 0x6ddcd141\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+00]\n",
      "  Objective range  [4e+01, 1e+03]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+00, 2e+00]\n",
      "Presolve time: 0.00s\n",
      "Presolved: 10 rows, 45 columns, 90 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.3600000e+02   1.700000e+01   0.000000e+00      0s\n",
      "      15    2.7610000e+03   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 15 iterations and 0.00 seconds (0.00 work units)\n",
      "Optimal objective  2.761000000e+03\n",
      "Set parameter LazyConstraints to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 32 threads\n",
      "\n",
      "Optimize a model with 10 rows, 45 columns and 90 nonzeros\n",
      "Model fingerprint: 0x74ca3d0a\n",
      "Variable types: 0 continuous, 45 integer (45 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+00]\n",
      "  Objective range  [4e+01, 1e+03]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+00, 2e+00]\n",
      "\n",
      "User MIP start produced solution with objective 2796 (0.00s)\n",
      "Loaded user MIP start with objective 2796\n",
      "\n",
      "Presolve time: 0.00s\n",
      "Presolved: 10 rows, 45 columns, 90 nonzeros\n",
      "Variable types: 0 continuous, 45 integer (45 binary)\n",
      "\n",
      "Root relaxation: objective 2.761000e+03, 14 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 2761.00000    0    - 2796.00000 2761.00000  1.25%     -    0s\n",
      "     0     0     cutoff    0      2796.00000 2796.00000  0.00%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Lazy constraints: 3\n",
      "\n",
      "Explored 1 nodes (16 simplex iterations) in 0.01 seconds (0.00 work units)\n",
      "Thread count was 32 (of 32 available processors)\n",
      "\n",
      "Solution count 1: 2796 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 2.796000000000e+03, best bound 2.796000000000e+03, gap 0.0000%\n",
      "\n",
      "User-callback calls 110, time in user-callback 0.00 sec\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'WS: Count': 1, 'WS: Number of variables set': 41.0}"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import random\n",
    "\n",
    "import numpy as np\n",
    "from scipy.stats import uniform, randint\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "\n",
    "from miplearn.classifiers.minprob import MinProbabilityClassifier\n",
    "from miplearn.classifiers.singleclass import SingleClassFix\n",
    "from miplearn.collectors.basic import BasicCollector\n",
    "from miplearn.components.primal.actions import SetWarmStart\n",
    "from miplearn.components.primal.indep import IndependentVarsPrimalComponent\n",
    "from miplearn.extractors.AlvLouWeh2017 import AlvLouWeh2017Extractor\n",
    "from miplearn.io import write_pkl_gz\n",
    "from miplearn.problems.tsp import (\n",
    "    TravelingSalesmanGenerator,\n",
    "    build_tsp_model,\n",
    ")\n",
    "from miplearn.solvers.learning import LearningSolver\n",
    "\n",
    "# Set random seed to make example reproducible.\n",
    "random.seed(42)\n",
    "np.random.seed(42)\n",
    "\n",
    "# Generate a few instances of the traveling salesman problem.\n",
    "data = TravelingSalesmanGenerator(\n",
    "    n=randint(low=10, high=11),\n",
    "    x=uniform(loc=0.0, scale=1000.0),\n",
    "    y=uniform(loc=0.0, scale=1000.0),\n",
    "    gamma=uniform(loc=0.90, scale=0.20),\n",
    "    fix_cities=True,\n",
    "    round=True,\n",
    ").generate(50)\n",
    "\n",
    "# Save instance data to data/tsp/00000.pkl.gz, data/tsp/00001.pkl.gz, ...\n",
    "all_data = write_pkl_gz(data, \"data/tsp\")\n",
    "\n",
    "# Split train/test data\n",
    "train_data = all_data[:40]\n",
    "test_data = all_data[40:]\n",
    "\n",
    "# Collect training data\n",
    "bc = BasicCollector()\n",
    "bc.collect(train_data, build_tsp_model, n_jobs=4)\n",
    "\n",
    "# Build learning solver\n",
    "solver = LearningSolver(\n",
    "    components=[\n",
    "        IndependentVarsPrimalComponent(\n",
    "            base_clf=SingleClassFix(\n",
    "                MinProbabilityClassifier(\n",
    "                    base_clf=LogisticRegression(),\n",
    "                    thresholds=[0.95, 0.95],\n",
    "                ),\n",
    "            ),\n",
    "            extractor=AlvLouWeh2017Extractor(),\n",
    "            action=SetWarmStart(),\n",
    "        )\n",
    "    ]\n",
    ")\n",
    "\n",
    "# Train ML models\n",
    "solver.fit(train_data)\n",
    "\n",
    "# Solve a test instance\n",
    "solver.optimize(test_data[0], build_tsp_model)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e27d2cbd-5341-461d-bbc1-8131aee8d949",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,113 +0,0 @@
 # MIPLearn
 **MIPLearn** is an extensible framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML). The framework uses ML methods to automatically identify patterns in previously solved instances of the problem, then uses these patterns to accelerate the performance of conventional state-of-the-art MIP solvers (such as CPLEX, Gurobi or XPRESS).
 Unlike pure ML methods, MIPLearn is not only able to find high-quality solutions to discrete optimization problems, but it can also prove the optimality and feasibility of these solutions.
 Unlike conventional MIP solvers, MIPLearn can take full advantage of very specific observations that happen to be true in a particular family of instances (such as the observation that a particular constraint is typically redundant, or that a particular variable typically assumes a certain value). 
 ## Table of Contents
 ```{toctree}
 ---
 maxdepth: 1
 caption: JuMP Tutorials
 numbered: true
 ---
 jump-tutorials/getting-started.ipynb
 #jump-tutorials/lazy-constraints.ipynb
 #jump-tutorials/user-cuts.ipynb
 #jump-tutorials/customizing-ml.ipynb
 ```
 ```{toctree}
 ---
 maxdepth: 1
 caption: Pyomo Tutorials
 numbered: true
 ---
 pyomo-tutorials/getting-started.ipynb
 #pyomo-tutorials/lazy-constraints.ipynb
 #pyomo-tutorials/user-cuts.ipynb
 #pyomo-tutorials/customizing-ml.ipynb
 ```
 ```{toctree}
 ---
 maxdepth: 1
 caption: Benchmarks
 numbered: true
 ---
 benchmarks/preliminaries.ipynb
 benchmarks/stab.ipynb
 #benchmarks/uc.ipynb
 #benchmarks/facility.ipynb
 benchmarks/knapsack.ipynb
 benchmarks/tsp.ipynb
 ```
 ```{toctree}
 ---
 maxdepth: 1
 caption: MIPLearn Internals
 numbered: true
 ---
 #internals/solver-interfaces.ipynb
 #internals/data-collection.ipynb
 #internals/abstract-component.ipynb
 #internals/primal.ipynb
 #internals/static-lazy.ipynb
 #internals/dynamic-lazy.ipynb
 ```
 ## Source Code
 * [https://github.com/ANL-CEEESA/MIPLearn](https://github.com/ANL-CEEESA/MIPLearn)
 ## Authors
 * **Alinson S. Xavier,** Argonne National Laboratory <<axavier@anl.gov>>
 * **Feng Qiu,** Argonne National Laboratory <<fqiu@anl.gov>>
 ## Acknowledgments
 * Based upon work supported by U.S. Department of Energy **Advanced Grid Modeling Program** under Grant DE-OE0000875.
 * Based upon work supported by **Laboratory Directed Research and Development** (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357
 ## References
 If you use MIPLearn in your research, or the included problem generators, we kindly request that you cite the package as follows:
 - **Alinson S. Xavier, Feng Qiu.** *MIPLearn: An Extensible Framework for Learning-Enhanced Optimization*. Zenodo (2020). DOI: [10.5281/zenodo.4287567](https://doi.org/10.5281/zenodo.4287567)
 If you use MIPLearn in the field of power systems optimization, we kindly request that you cite the reference below, in which the main techniques implemented in MIPLearn were first developed:
 - **Alinson S. Xavier, Feng Qiu, Shabbir Ahmed.** *Learning to Solve Large-Scale Unit Commitment Problems.* INFORMS Journal on Computing (2021). DOI: [10.1287/ijoc.2020.0976](https://doi.org/10.1287/ijoc.2020.0976)
 ## License
 ```text
 MIPLearn, an extensible framework for Learning-Enhanced Mixed-Integer Optimization
 Copyright © 2020, UChicago Argonne, LLC. All Rights Reserved.
 Redistribution and use in source and binary forms, with or without modification, are permitted
 provided that the following conditions are met:
 1. Redistributions of source code must retain the above copyright notice, this list of
   conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright notice, this list of
   conditions and the following disclaimer in the documentation and/or other materials provided
   with the distribution.
 3. Neither the name of the copyright holder nor the names of its contributors may be used to
   endorse or promote products derived from this software without specific prior written
   permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
 AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR
 CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
 OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGE.
 ```
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -0,0 +1,67 @@
 MIPLearn
 ========
 **MIPLearn** is an extensible framework for solving discrete optimization problems using a combination of Mixed-Integer Linear Programming (MIP) and Machine Learning (ML). MIPLearn uses ML methods to automatically identify patterns in previously solved instances of the problem, then uses these patterns to accelerate the performance of conventional state-of-the-art MIP solvers such as CPLEX, Gurobi or XPRESS.
 Unlike pure ML methods, MIPLearn is not only able to find high-quality solutions to discrete optimization problems, but it can also prove the optimality and feasibility of these solutions. Unlike conventional MIP solvers, MIPLearn can take full advantage of very specific observations that happen to be true in a particular family of instances (such as the observation that a particular constraint is typically redundant, or that a particular variable typically assumes a certain value). For certain classes of problems, this approach may provide significant performance benefits.
 Contents
 --------
 .. toctree::
   :maxdepth: 1
   :caption: Tutorials
   :numbered: 2
   tutorials/getting-started-pyomo
   tutorials/getting-started-gurobipy
   tutorials/getting-started-jump
 .. toctree::
   :maxdepth: 2
   :caption: User Guide
   :numbered: 2
   guide/problems
   guide/collectors
   guide/features
   guide/primal
   guide/solvers
 .. toctree::
   :maxdepth: 1
   :caption: Python API Reference
   :numbered: 2
   api/problems
   api/collectors
   api/components
   api/solvers
   api/helpers
 Authors
 -------
 - **Alinson S. Xavier** (Argonne National Laboratory)
 - **Feng Qiu** (Argonne National Laboratory)
 - **Xiaoyi Gu** (Georgia Institute of Technology)
 - **Berkay Becu** (Georgia Institute of Technology)
 - **Santanu S. Dey** (Georgia Institute of Technology)
 Acknowledgments
 ---------------
 * Based upon work supported by **Laboratory Directed Research and Development** (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy.
 * Based upon work supported by the **U.S. Department of Energy Advanced Grid Modeling Program**.
 Citing MIPLearn
 ---------------
 If you use MIPLearn in your research (either the solver or the included problem generators), we kindly request that you cite the package as follows:
 * **Alinson S. Xavier, Feng Qiu, Xiaoyi Gu, Berkay Becu, Santanu S. Dey.** *MIPLearn: An Extensible Framework for Learning-Enhanced Optimization (Version 0.3)*. Zenodo (2023). DOI: https://doi.org/10.5281/zenodo.4287567
 If you use MIPLearn in the field of power systems optimization, we kindly request that you cite the reference below, in which the main techniques implemented in MIPLearn were first developed:
 * **Alinson S. Xavier, Feng Qiu, Shabbir Ahmed.** *Learning to Solve Large-Scale Unit Commitment Problems.* INFORMS Journal on Computing (2020). DOI: https://doi.org/10.1287/ijoc.2020.0976
--- a/docs/internals/abstract-component.ipynb
+++ b/docs/internals/abstract-component.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ad9274ff",
   "metadata": {},
   "source": [
    "# Abstract component\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/internals/data-collection.ipynb
+++ b/docs/internals/data-collection.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "780b4172",
   "metadata": {},
   "source": [
    "# Training data collection\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/internals/dynamic-lazy.ipynb
+++ b/docs/internals/dynamic-lazy.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5e3dd4c0",
   "metadata": {},
   "source": [
    "# Dynamic lazy constraints & user cuts\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/internals/primal.ipynb
+++ b/docs/internals/primal.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c6d0d8dc",
   "metadata": {},
   "source": [
    "# Primal solutions\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/internals/solver-interfaces.ipynb
+++ b/docs/internals/solver-interfaces.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ac509ea5",
   "metadata": {},
   "source": [
    "# Solver interfaces\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/internals/static-lazy.ipynb
+++ b/docs/internals/static-lazy.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ae350662",
   "metadata": {},
   "source": [
    "# Static lazy constraints\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/jump-tutorials/Manifest.toml
+++ b/docs/jump-tutorials/Manifest.toml
@@ -1,772 +0,0 @@
 # This file is machine-generated - editing it directly is not advised
 [[ASL_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "6252039f98492252f9e47c312c8ffda0e3b9e78d"
 uuid = "ae81ac8f-d209-56e5-92de-9978fef736f9"
 version = "0.1.3+0"
 [[ArgTools]]
 uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"
 [[Artifacts]]
 uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
 [[Base64]]
 uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
 [[BenchmarkTools]]
 deps = ["JSON", "Logging", "Printf", "Profile", "Statistics", "UUIDs"]
 git-tree-sha1 = "4c10eee4af024676200bc7752e536f858c6b8f93"
 uuid = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 version = "1.3.1"
 [[BinaryProvider]]
 deps = ["Libdl", "Logging", "SHA"]
 git-tree-sha1 = "ecdec412a9abc8db54c0efc5548c64dfce072058"
 uuid = "b99e7846-7c00-51b0-8f62-c81ae34c0232"
 version = "0.5.10"
 [[Bzip2_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "19a35467a82e236ff51bc17a3a44b69ef35185a2"
 uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0"
 version = "1.0.8+0"
 [[CEnum]]
 git-tree-sha1 = "215a9aa4a1f23fbd05b92769fdd62559488d70e9"
 uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82"
 version = "0.4.1"
 [[CSV]]
 deps = ["CodecZlib", "Dates", "FilePathsBase", "InlineStrings", "Mmap", "Parsers", "PooledArrays", "SentinelArrays", "Tables", "Unicode", "WeakRefStrings"]
 git-tree-sha1 = "9519274b50500b8029973d241d32cfbf0b127d97"
 uuid = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
 version = "0.10.2"
 [[Calculus]]
 deps = ["LinearAlgebra"]
 git-tree-sha1 = "f641eb0a4f00c343bbc32346e1217b86f3ce9dad"
 uuid = "49dc2e85-a5d0-5ad3-a950-438e2897f1b9"
 version = "0.5.1"
 [[Cbc]]
 deps = ["BinaryProvider", "CEnum", "Cbc_jll", "Libdl", "MathOptInterface", "SparseArrays"]
 git-tree-sha1 = "98e3692f90b26a340f32e17475c396c3de4180de"
 uuid = "9961bab8-2fa3-5c5a-9d89-47fab24efd76"
 version = "0.8.1"
 [[Cbc_jll]]
 deps = ["ASL_jll", "Artifacts", "Cgl_jll", "Clp_jll", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS32_jll", "Osi_jll", "Pkg"]
 git-tree-sha1 = "7693a7ca006d25e0d0097a5eee18ce86368e00cd"
 uuid = "38041ee0-ae04-5750-a4d2-bb4d0d83d27d"
 version = "200.1000.500+1"
 [[Cgl_jll]]
 deps = ["Artifacts", "Clp_jll", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Osi_jll", "Pkg"]
 git-tree-sha1 = "b5557f48e0e11819bdbda0200dbfa536dd12d9d9"
 uuid = "3830e938-1dd0-5f3e-8b8e-b3ee43226782"
 version = "0.6000.200+0"
 [[ChainRulesCore]]
 deps = ["Compat", "LinearAlgebra", "SparseArrays"]
 git-tree-sha1 = "c9a6160317d1abe9c44b3beb367fd448117679ca"
 uuid = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
 version = "1.13.0"
 [[ChangesOfVariables]]
 deps = ["ChainRulesCore", "LinearAlgebra", "Test"]
 git-tree-sha1 = "bf98fa45a0a4cee295de98d4c1462be26345b9a1"
 uuid = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0"
 version = "0.1.2"
 [[Clp]]
 deps = ["BinaryProvider", "CEnum", "Clp_jll", "Libdl", "MathOptInterface", "SparseArrays"]
 git-tree-sha1 = "3df260c4a5764858f312ec2a17f5925624099f3a"
 uuid = "e2554f3b-3117-50c0-817c-e040a3ddf72d"
 version = "0.8.4"
 [[Clp_jll]]
 deps = ["Artifacts", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "METIS_jll", "MUMPS_seq_jll", "OpenBLAS32_jll", "Osi_jll", "Pkg"]
 git-tree-sha1 = "5e4f9a825408dc6356e6bf1015e75d2b16250ec8"
 uuid = "06985876-5285-5a41-9fcb-8948a742cc53"
 version = "100.1700.600+0"
 [[CodeTracking]]
 deps = ["InteractiveUtils", "UUIDs"]
 git-tree-sha1 = "759a12cefe1cd1bb49e477bc3702287521797483"
 uuid = "da1fd8a2-8d9e-5ec2-8556-3022fb5608a2"
 version = "1.0.7"
 [[CodecBzip2]]
 deps = ["Bzip2_jll", "Libdl", "TranscodingStreams"]
 git-tree-sha1 = "2e62a725210ce3c3c2e1a3080190e7ca491f18d7"
 uuid = "523fee87-0ab8-5b00-afb7-3ecf72e48cfd"
 version = "0.7.2"
 [[CodecZlib]]
 deps = ["TranscodingStreams", "Zlib_jll"]
 git-tree-sha1 = "ded953804d019afa9a3f98981d99b33e3db7b6da"
 uuid = "944b1d66-785c-5afd-91f1-9de20f533193"
 version = "0.7.0"
 [[CoinUtils_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "44173e61256f32918c6c132fc41f772bab1fb6d1"
 uuid = "be027038-0da8-5614-b30d-e42594cb92df"
 version = "200.1100.400+0"
 [[CommonSubexpressions]]
 deps = ["MacroTools", "Test"]
 git-tree-sha1 = "7b8a93dba8af7e3b42fecabf646260105ac373f7"
 uuid = "bbf7d656-a473-5ed7-a52c-81e309532950"
 version = "0.3.0"
 [[Compat]]
 deps = ["Base64", "Dates", "DelimitedFiles", "Distributed", "InteractiveUtils", "LibGit2", "Libdl", "LinearAlgebra", "Markdown", "Mmap", "Pkg", "Printf", "REPL", "Random", "SHA", "Serialization", "SharedArrays", "Sockets", "SparseArrays", "Statistics", "Test", "UUIDs", "Unicode"]
 git-tree-sha1 = "44c37b4636bc54afac5c574d2d02b625349d6582"
 uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
 version = "3.41.0"
 [[CompilerSupportLibraries_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
 [[Conda]]
 deps = ["Downloads", "JSON", "VersionParsing"]
 git-tree-sha1 = "6e47d11ea2776bc5627421d59cdcc1296c058071"
 uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
 version = "1.7.0"
 [[Crayons]]
 git-tree-sha1 = "249fe38abf76d48563e2f4556bebd215aa317e15"
 uuid = "a8cc5b0e-0ffa-5ad4-8c14-923d3ee1735f"
 version = "4.1.1"
 [[DataAPI]]
 git-tree-sha1 = "cc70b17275652eb47bc9e5f81635981f13cea5c8"
 uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
 version = "1.9.0"
 [[DataFrames]]
 deps = ["Compat", "DataAPI", "Future", "InvertedIndices", "IteratorInterfaceExtensions", "LinearAlgebra", "Markdown", "Missings", "PooledArrays", "PrettyTables", "Printf", "REPL", "Reexport", "SortingAlgorithms", "Statistics", "TableTraits", "Tables", "Unicode"]
 git-tree-sha1 = "ae02104e835f219b8930c7664b8012c93475c340"
 uuid = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 version = "1.3.2"
 [[DataStructures]]
 deps = ["Compat", "InteractiveUtils", "OrderedCollections"]
 git-tree-sha1 = "3daef5523dd2e769dad2365274f760ff5f282c7d"
 uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
 version = "0.18.11"
 [[DataValueInterfaces]]
 git-tree-sha1 = "bfc1187b79289637fa0ef6d4436ebdfe6905cbd6"
 uuid = "e2d170a0-9d28-54be-80f0-106bbe20a464"
 version = "1.0.0"
 [[Dates]]
 deps = ["Printf"]
 uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
 [[DelimitedFiles]]
 deps = ["Mmap"]
 uuid = "8bb1440f-4735-579b-a4ab-409b98df4dab"
 [[DensityInterface]]
 deps = ["InverseFunctions", "Test"]
 git-tree-sha1 = "80c3e8639e3353e5d2912fb3a1916b8455e2494b"
 uuid = "b429d917-457f-4dbc-8f4c-0cc954292b1d"
 version = "0.4.0"
 [[DiffResults]]
 deps = ["StaticArrays"]
 git-tree-sha1 = "c18e98cba888c6c25d1c3b048e4b3380ca956805"
 uuid = "163ba53b-c6d8-5494-b064-1a9d43ac40c5"
 version = "1.0.3"
 [[DiffRules]]
 deps = ["IrrationalConstants", "LogExpFunctions", "NaNMath", "Random", "SpecialFunctions"]
 git-tree-sha1 = "dd933c4ef7b4c270aacd4eb88fa64c147492acf0"
 uuid = "b552c78f-8df3-52c6-915a-8e097449b14b"
 version = "1.10.0"
 [[Distributed]]
 deps = ["Random", "Serialization", "Sockets"]
 uuid = "8ba89e20-285c-5b6f-9357-94700520ee1b"
 [[Distributions]]
 deps = ["ChainRulesCore", "DensityInterface", "FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SparseArrays", "SpecialFunctions", "Statistics", "StatsBase", "StatsFuns", "Test"]
 git-tree-sha1 = "9d3c0c762d4666db9187f363a76b47f7346e673b"
 uuid = "31c24e10-a181-5473-b8eb-7969acd0382f"
 version = "0.25.49"
 [[DocStringExtensions]]
 deps = ["LibGit2"]
 git-tree-sha1 = "b19534d1895d702889b219c382a6e18010797f0b"
 uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
 version = "0.8.6"
 [[Downloads]]
 deps = ["ArgTools", "LibCURL", "NetworkOptions"]
 uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
 [[DualNumbers]]
 deps = ["Calculus", "NaNMath", "SpecialFunctions"]
 git-tree-sha1 = "84f04fe68a3176a583b864e492578b9466d87f1e"
 uuid = "fa6b7ba4-c1ee-5f82-b5fc-ecf0adba8f74"
 version = "0.6.6"
 [[ExprTools]]
 git-tree-sha1 = "56559bbef6ca5ea0c0818fa5c90320398a6fbf8d"
 uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
 version = "0.1.8"
 [[FileIO]]
 deps = ["Pkg", "Requires", "UUIDs"]
 git-tree-sha1 = "80ced645013a5dbdc52cf70329399c35ce007fae"
 uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
 version = "1.13.0"
 [[FilePathsBase]]
 deps = ["Compat", "Dates", "Mmap", "Printf", "Test", "UUIDs"]
 git-tree-sha1 = "04d13bfa8ef11720c24e4d840c0033d145537df7"
 uuid = "48062228-2e41-5def-b9a4-89aafe57970f"
 version = "0.9.17"
 [[FileWatching]]
 uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee"
 [[FillArrays]]
 deps = ["LinearAlgebra", "Random", "SparseArrays", "Statistics"]
 git-tree-sha1 = "4c7d3757f3ecbcb9055870351078552b7d1dbd2d"
 uuid = "1a297f60-69ca-5386-bcde-b61e274b549b"
 version = "0.13.0"
 [[Formatting]]
 deps = ["Printf"]
 git-tree-sha1 = "8339d61043228fdd3eb658d86c926cb282ae72a8"
 uuid = "59287772-0a20-5a39-b81b-1366585eb4c0"
 version = "0.4.2"
 [[ForwardDiff]]
 deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "LinearAlgebra", "LogExpFunctions", "NaNMath", "Preferences", "Printf", "Random", "SpecialFunctions", "StaticArrays"]
 git-tree-sha1 = "1bd6fc0c344fc0cbee1f42f8d2e7ec8253dda2d2"
 uuid = "f6369f11-7733-5829-9624-2563aa707210"
 version = "0.10.25"
 [[Future]]
 deps = ["Random"]
 uuid = "9fa8497b-333b-5362-9e8d-4d0656e87820"
 [[GMP_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "781609d7-10c4-51f6-84f2-b8444358ff6d"
 [[Glob]]
 git-tree-sha1 = "4df9f7e06108728ebf00a0a11edee4b29a482bb2"
 uuid = "c27321d9-0574-5035-807b-f59d2c89b15c"
 version = "1.3.0"
 [[HTTP]]
 deps = ["Base64", "Dates", "IniFile", "MbedTLS", "Sockets"]
 git-tree-sha1 = "c7ec02c4c6a039a98a15f955462cd7aea5df4508"
 uuid = "cd3eb016-35fb-5094-929b-558a96fad6f3"
 version = "0.8.19"
 [[HypergeometricFunctions]]
 deps = ["DualNumbers", "LinearAlgebra", "SpecialFunctions", "Test"]
 git-tree-sha1 = "65e4589030ef3c44d3b90bdc5aac462b4bb05567"
 uuid = "34004b35-14d8-5ef3-9330-4cdb6864b03a"
 version = "0.3.8"
 [[IniFile]]
 git-tree-sha1 = "f550e6e32074c939295eb5ea6de31849ac2c9625"
 uuid = "83e8ac13-25f8-5344-8a64-a9f2b223428f"
 version = "0.5.1"
 [[InlineStrings]]
 deps = ["Parsers"]
 git-tree-sha1 = "61feba885fac3a407465726d0c330b3055df897f"
 uuid = "842dd82b-1e85-43dc-bf29-5d0ee9dffc48"
 version = "1.1.2"
 [[InteractiveUtils]]
 deps = ["Markdown"]
 uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
 [[InverseFunctions]]
 deps = ["Test"]
 git-tree-sha1 = "a7254c0acd8e62f1ac75ad24d5db43f5f19f3c65"
 uuid = "3587e190-3f89-42d0-90ee-14403ec27112"
 version = "0.1.2"
 [[InvertedIndices]]
 git-tree-sha1 = "bee5f1ef5bf65df56bdd2e40447590b272a5471f"
 uuid = "41ab1584-1d38-5bbf-9106-f11c6c58b48f"
 version = "1.1.0"
 [[Ipopt_jll]]
 deps = ["ASL_jll", "Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "MUMPS_seq_jll", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "82124f27743f2802c23fcb05febc517d0b15d86e"
 uuid = "9cc047cb-c261-5740-88fc-0cf96f7bdcc7"
 version = "3.13.4+2"
 [[IrrationalConstants]]
 git-tree-sha1 = "7fd44fd4ff43fc60815f8e764c0f352b83c49151"
 uuid = "92d709cd-6900-40b7-9082-c6be49f344b6"
 version = "0.1.1"
 [[IteratorInterfaceExtensions]]
 git-tree-sha1 = "a3f24677c21f5bbe9d2a714f95dcd58337fb2856"
 uuid = "82899510-4779-5014-852e-03e436cf321d"
 version = "1.0.0"
 [[JLD2]]
 deps = ["FileIO", "MacroTools", "Mmap", "OrderedCollections", "Pkg", "Printf", "Reexport", "TranscodingStreams", "UUIDs"]
 git-tree-sha1 = "28b114b3279cdbac9a61c57b3e6548a572142b34"
 uuid = "033835bb-8acc-5ee8-8aae-3f567f8a3819"
 version = "0.4.21"
 [[JLLWrappers]]
 deps = ["Preferences"]
 git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1"
 uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210"
 version = "1.4.1"
 [[JSON]]
 deps = ["Dates", "Mmap", "Parsers", "Unicode"]
 git-tree-sha1 = "3c837543ddb02250ef42f4738347454f95079d4e"
 uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
 version = "0.21.3"
 [[JSONSchema]]
 deps = ["HTTP", "JSON", "ZipFile"]
 git-tree-sha1 = "b84ab8139afde82c7c65ba2b792fe12e01dd7307"
 uuid = "7d188eb4-7ad8-530c-ae41-71a32a6d4692"
 version = "0.3.3"
 [[JuMP]]
 deps = ["Calculus", "DataStructures", "ForwardDiff", "JSON", "LinearAlgebra", "MathOptInterface", "MutableArithmetics", "NaNMath", "Printf", "Random", "SparseArrays", "SpecialFunctions", "Statistics"]
 git-tree-sha1 = "4358b7cbf2db36596bdbbe3becc6b9d87e4eb8f5"
 uuid = "4076af6c-e467-56ae-b986-b466b2749572"
 version = "0.21.10"
 [[JuliaInterpreter]]
 deps = ["CodeTracking", "InteractiveUtils", "Random", "UUIDs"]
 git-tree-sha1 = "0a815f0060ab182f6c484b281107bfcd5bbb58dc"
 uuid = "aa1ae85d-cabe-5617-a682-6adf51b2e16a"
 version = "0.9.7"
 [[LazyArtifacts]]
 deps = ["Artifacts", "Pkg"]
 uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
 [[LibCURL]]
 deps = ["LibCURL_jll", "MozillaCACerts_jll"]
 uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
 [[LibCURL_jll]]
 deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
 uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
 [[LibGit2]]
 deps = ["Base64", "NetworkOptions", "Printf", "SHA"]
 uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
 [[LibSSH2_jll]]
 deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
 uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
 [[Libdl]]
 uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
 [[LinearAlgebra]]
 deps = ["Libdl"]
 uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 [[LogExpFunctions]]
 deps = ["ChainRulesCore", "ChangesOfVariables", "DocStringExtensions", "InverseFunctions", "IrrationalConstants", "LinearAlgebra"]
 git-tree-sha1 = "e5718a00af0ab9756305a0392832c8952c7426c1"
 uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
 version = "0.3.6"
 [[Logging]]
 uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
 [[LoweredCodeUtils]]
 deps = ["JuliaInterpreter"]
 git-tree-sha1 = "6b0440822974cab904c8b14d79743565140567f6"
 uuid = "6f1432cf-f94c-5a45-995e-cdbf5db27b0b"
 version = "2.2.1"
 [[METIS_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "1d31872bb9c5e7ec1f618e8c4a56c8b0d9bddc7e"
 uuid = "d00139f3-1899-568f-a2f0-47f597d42d70"
 version = "5.1.1+0"
 [[MIPLearn]]
 deps = ["CSV", "Cbc", "Clp", "Conda", "DataFrames", "DataStructures", "Distributed", "JLD2", "JSON", "JuMP", "Logging", "MathOptInterface", "PackageCompiler", "Printf", "PyCall", "Random", "SparseArrays", "Statistics", "TimerOutputs"]
 path = "/home/isoron/Developer/MIPLearn.jl/dev"
 uuid = "2b1277c3-b477-4c49-a15e-7ba350325c68"
 version = "0.2.0"
 [[MUMPS_seq_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "METIS_jll", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "1a11a84b2af5feb5a62a820574804056cdc59c39"
 uuid = "d7ed1dd3-d0ae-5e8e-bfb4-87a502085b8d"
 version = "5.2.1+4"
 [[MacroTools]]
 deps = ["Markdown", "Random"]
 git-tree-sha1 = "3d3e902b31198a27340d0bf00d6ac452866021cf"
 uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
 version = "0.5.9"
 [[Markdown]]
 deps = ["Base64"]
 uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
 [[MathOptInterface]]
 deps = ["BenchmarkTools", "CodecBzip2", "CodecZlib", "JSON", "JSONSchema", "LinearAlgebra", "MutableArithmetics", "OrderedCollections", "SparseArrays", "Test", "Unicode"]
 git-tree-sha1 = "575644e3c05b258250bb599e57cf73bbf1062901"
 uuid = "b8f27783-ece8-5eb3-8dc8-9495eed66fee"
 version = "0.9.22"
 [[MbedTLS]]
 deps = ["Dates", "MbedTLS_jll", "Random", "Sockets"]
 git-tree-sha1 = "1c38e51c3d08ef2278062ebceade0e46cefc96fe"
 uuid = "739be429-bea8-5141-9913-cc70e7f3736d"
 version = "1.0.3"
 [[MbedTLS_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
 [[Missings]]
 deps = ["DataAPI"]
 git-tree-sha1 = "bf210ce90b6c9eed32d25dbcae1ebc565df2687f"
 uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
 version = "1.0.2"
 [[Mmap]]
 uuid = "a63ad114-7e13-5084-954f-fe012c677804"
 [[MozillaCACerts_jll]]
 uuid = "14a3606d-f60d-562e-9121-12d972cd8159"
 [[MutableArithmetics]]
 deps = ["LinearAlgebra", "SparseArrays", "Test"]
 git-tree-sha1 = "8d9496b2339095901106961f44718920732616bb"
 uuid = "d8a4904e-b15c-11e9-3269-09a3773c0cb0"
 version = "0.2.22"
 [[NaNMath]]
 git-tree-sha1 = "b086b7ea07f8e38cf122f5016af580881ac914fe"
 uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3"
 version = "0.3.7"
 [[NetworkOptions]]
 uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"
 [[OpenBLAS32_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "ba4a8f683303c9082e84afba96f25af3c7fb2436"
 uuid = "656ef2d0-ae68-5445-9ca0-591084a874a2"
 version = "0.3.12+1"
 [[OpenLibm_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "05823500-19ac-5b8b-9628-191a04bc5112"
 [[OpenSpecFun_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1"
 uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
 version = "0.5.5+0"
 [[OrderedCollections]]
 git-tree-sha1 = "85f8e6578bf1f9ee0d11e7bb1b1456435479d47c"
 uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
 version = "1.4.1"
 [[Osi_jll]]
 deps = ["Artifacts", "CoinUtils_jll", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "OpenBLAS32_jll", "Pkg"]
 git-tree-sha1 = "28e0ddebd069f605ab1988ab396f239a3ac9b561"
 uuid = "7da25872-d9ce-5375-a4d3-7a845f58efdd"
 version = "0.10800.600+0"
 [[PDMats]]
 deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"]
 git-tree-sha1 = "7e2166042d1698b6072352c74cfd1fca2a968253"
 uuid = "90014a1f-27ba-587c-ab20-58faa44d9150"
 version = "0.11.6"
 [[PackageCompiler]]
 deps = ["Artifacts", "LazyArtifacts", "Libdl", "Pkg", "Printf", "RelocatableFolders", "UUIDs"]
 git-tree-sha1 = "4ad92047603f8e955503f92767577b32508c39af"
 uuid = "9b87118b-4619-50d2-8e1e-99f35a4d4d9d"
 version = "2.0.5"
 [[Parsers]]
 deps = ["Dates"]
 git-tree-sha1 = "13468f237353112a01b2d6b32f3d0f80219944aa"
 uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
 version = "2.2.2"
 [[Pkg]]
 deps = ["Artifacts", "Dates", "Downloads", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"]
 uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 [[PooledArrays]]
 deps = ["DataAPI", "Future"]
 git-tree-sha1 = "db3a23166af8aebf4db5ef87ac5b00d36eb771e2"
 uuid = "2dfb63ee-cc39-5dd5-95bd-886bf059d720"
 version = "1.4.0"
 [[Preferences]]
 deps = ["TOML"]
 git-tree-sha1 = "de893592a221142f3db370f48290e3a2ef39998f"
 uuid = "21216c6a-2e73-6563-6e65-726566657250"
 version = "1.2.4"
 [[PrettyTables]]
 deps = ["Crayons", "Formatting", "Markdown", "Reexport", "Tables"]
 git-tree-sha1 = "dfb54c4e414caa595a1f2ed759b160f5a3ddcba5"
 uuid = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
 version = "1.3.1"
 [[Printf]]
 deps = ["Unicode"]
 uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
 [[Profile]]
 deps = ["Printf"]
 uuid = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"
 [[ProgressBars]]
 deps = ["Printf"]
 git-tree-sha1 = "938525cc66a4058f6ed75b84acd13a00fbecea11"
 uuid = "49802e3a-d2f1-5c88-81d8-b72133a6f568"
 version = "1.4.0"
 [[PyCall]]
 deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"]
 git-tree-sha1 = "71fd4022ecd0c6d20180e23ff1b3e05a143959c2"
 uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
 version = "1.93.0"
 [[QuadGK]]
 deps = ["DataStructures", "LinearAlgebra"]
 git-tree-sha1 = "78aadffb3efd2155af139781b8a8df1ef279ea39"
 uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc"
 version = "2.4.2"
 [[REPL]]
 deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"]
 uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
 [[Random]]
 deps = ["Serialization"]
 uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 [[Reexport]]
 git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b"
 uuid = "189a3867-3050-52da-a836-e630ba90ab69"
 version = "1.2.2"
 [[RelocatableFolders]]
 deps = ["SHA", "Scratch"]
 git-tree-sha1 = "cdbd3b1338c72ce29d9584fdbe9e9b70eeb5adca"
 uuid = "05181044-ff0b-4ac5-8273-598c1e38db00"
 version = "0.1.3"
 [[Requires]]
 deps = ["UUIDs"]
 git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7"
 uuid = "ae029012-a4dd-5104-9daa-d747884805df"
 version = "1.3.0"
 [[Revise]]
 deps = ["CodeTracking", "Distributed", "FileWatching", "JuliaInterpreter", "LibGit2", "LoweredCodeUtils", "OrderedCollections", "Pkg", "REPL", "Requires", "UUIDs", "Unicode"]
 git-tree-sha1 = "606ddc4d3d098447a09c9337864c73d017476424"
 uuid = "295af30f-e4ad-537b-8983-00126c2a3abe"
 version = "3.3.2"
 [[Rmath]]
 deps = ["Random", "Rmath_jll"]
 git-tree-sha1 = "bf3188feca147ce108c76ad82c2792c57abe7b1f"
 uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa"
 version = "0.7.0"
 [[Rmath_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "68db32dff12bb6127bac73c209881191bf0efbb7"
 uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f"
 version = "0.3.0+0"
 [[SCIP]]
 deps = ["Ipopt_jll", "Libdl", "MathOptInterface", "SCIP_jll"]
 git-tree-sha1 = "6b799e6a23746633f94f4f10a9ac234f8b86f680"
 repo-rev = "7aa79aaa"
 repo-url = "https://github.com/scipopt/SCIP.jl.git"
 uuid = "82193955-e24f-5292-bf16-6f2c5261a85f"
 version = "0.9.8"
 [[SCIP_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "GMP_jll", "Ipopt_jll", "JLLWrappers", "Libdl", "Pkg", "Zlib_jll", "bliss_jll"]
 git-tree-sha1 = "83d35a061885aa73491aa2f8db28310214bbd521"
 uuid = "e5ac4fe4-a920-5659-9bf8-f9f73e9e79ce"
 version = "0.1.3+0"
 [[SHA]]
 uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
 [[Scratch]]
 deps = ["Dates"]
 git-tree-sha1 = "0b4b7f1393cff97c33891da2a0bf69c6ed241fda"
 uuid = "6c6a2e73-6563-6170-7368-637461726353"
 version = "1.1.0"
 [[SentinelArrays]]
 deps = ["Dates", "Random"]
 git-tree-sha1 = "6a2f7d70512d205ca8c7ee31bfa9f142fe74310c"
 uuid = "91c51154-3ec4-41a3-a24f-3f23e20d615c"
 version = "1.3.12"
 [[Serialization]]
 uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
 [[SharedArrays]]
 deps = ["Distributed", "Mmap", "Random", "Serialization"]
 uuid = "1a1011a3-84de-559e-8e89-a11a2f7dc383"
 [[Sockets]]
 uuid = "6462fe0b-24de-5631-8697-dd941f90decc"
 [[SortingAlgorithms]]
 deps = ["DataStructures"]
 git-tree-sha1 = "b3363d7460f7d098ca0912c69b082f75625d7508"
 uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c"
 version = "1.0.1"
 [[SparseArrays]]
 deps = ["LinearAlgebra", "Random"]
 uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
 [[SpecialFunctions]]
 deps = ["ChainRulesCore", "IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"]
 git-tree-sha1 = "cbf21db885f478e4bd73b286af6e67d1beeebe4c"
 uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
 version = "1.8.4"
 [[StaticArrays]]
 deps = ["LinearAlgebra", "Random", "Statistics"]
 git-tree-sha1 = "6354dfaf95d398a1a70e0b28238321d5d17b2530"
 uuid = "90137ffa-7385-5640-81b9-e52037218182"
 version = "1.4.0"
 [[Statistics]]
 deps = ["LinearAlgebra", "SparseArrays"]
 uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 [[StatsAPI]]
 deps = ["LinearAlgebra"]
 git-tree-sha1 = "c3d8ba7f3fa0625b062b82853a7d5229cb728b6b"
 uuid = "82ae8749-77ed-4fe6-ae5f-f523153014b0"
 version = "1.2.1"
 [[StatsBase]]
 deps = ["DataAPI", "DataStructures", "LinearAlgebra", "LogExpFunctions", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics", "StatsAPI"]
 git-tree-sha1 = "8977b17906b0a1cc74ab2e3a05faa16cf08a8291"
 uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
 version = "0.33.16"
 [[StatsFuns]]
 deps = ["ChainRulesCore", "HypergeometricFunctions", "InverseFunctions", "IrrationalConstants", "LogExpFunctions", "Reexport", "Rmath", "SpecialFunctions"]
 git-tree-sha1 = "25405d7016a47cf2bd6cd91e66f4de437fd54a07"
 uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
 version = "0.9.16"
 [[SuiteSparse]]
 deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"]
 uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9"
 [[TOML]]
 deps = ["Dates"]
 uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
 [[TableTraits]]
 deps = ["IteratorInterfaceExtensions"]
 git-tree-sha1 = "c06b2f539df1c6efa794486abfb6ed2022561a39"
 uuid = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c"
 version = "1.0.1"
 [[Tables]]
 deps = ["DataAPI", "DataValueInterfaces", "IteratorInterfaceExtensions", "LinearAlgebra", "TableTraits", "Test"]
 git-tree-sha1 = "bb1064c9a84c52e277f1096cf41434b675cd368b"
 uuid = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"
 version = "1.6.1"
 [[Tar]]
 deps = ["ArgTools", "SHA"]
 uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"
 [[Test]]
 deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
 uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 [[TimerOutputs]]
 deps = ["ExprTools", "Printf"]
 git-tree-sha1 = "97e999be94a7147d0609d0b9fc9feca4bf24d76b"
 uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
 version = "0.5.15"
 [[TranscodingStreams]]
 deps = ["Random", "Test"]
 git-tree-sha1 = "216b95ea110b5972db65aa90f88d8d89dcb8851c"
 uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa"
 version = "0.9.6"
 [[UUIDs]]
 deps = ["Random", "SHA"]
 uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
 [[Unicode]]
 uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
 [[VersionParsing]]
 git-tree-sha1 = "58d6e80b4ee071f5efd07fda82cb9fbe17200868"
 uuid = "81def892-9a0e-5fdd-b105-ffc91e053289"
 version = "1.3.0"
 [[WeakRefStrings]]
 deps = ["DataAPI", "InlineStrings", "Parsers"]
 git-tree-sha1 = "c69f9da3ff2f4f02e811c3323c22e5dfcb584cfa"
 uuid = "ea10d353-3f73-51f8-a26c-33c1cb351aa5"
 version = "1.4.1"
 [[ZipFile]]
 deps = ["Libdl", "Printf", "Zlib_jll"]
 git-tree-sha1 = "3593e69e469d2111389a9bd06bac1f3d730ac6de"
 uuid = "a5390f91-8eb1-5f08-bee0-b1d1ffed6cea"
 version = "0.9.4"
 [[Zlib_jll]]
 deps = ["Libdl"]
 uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
 [[bliss_jll]]
 deps = ["Artifacts", "GMP_jll", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "f8b75e896a326a162a4f6e998990521d8302c810"
 uuid = "508c9074-7a14-5c94-9582-3d4bc1871065"
 version = "0.77.0+1"
 [[nghttp2_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
 [[p7zip_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0"
--- a/docs/jump-tutorials/Project.toml
+++ b/docs/jump-tutorials/Project.toml
@@ -1,9 +0,0 @@
 [deps]
 Cbc = "9961bab8-2fa3-5c5a-9d89-47fab24efd76"
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 Glob = "c27321d9-0574-5035-807b-f59d2c89b15c"
 JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
 MIPLearn = "2b1277c3-b477-4c49-a15e-7ba350325c68"
 ProgressBars = "49802e3a-d2f1-5c88-81d8-b72133a6f568"
 Revise = "295af30f-e4ad-537b-8983-00126c2a3abe"
 SCIP = "82193955-e24f-5292-bf16-6f2c5261a85f"
--- a/docs/jump-tutorials/customizing-ml.ipynb
+++ b/docs/jump-tutorials/customizing-ml.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ea2dc06a",
   "metadata": {},
   "source": [
    "# Customizing the ML models\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/jump-tutorials/getting-started.ipynb
+++ b/docs/jump-tutorials/getting-started.ipynb
@@ -1,691 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9b0c4eed",
   "metadata": {},
   "source": [
    "# Getting started\n",
    "\n",
    "## Introduction\n",
    "\n",
    "**MIPLearn** is an open-source framework that uses machine learning (ML) to accelerate both commercial and open-source mixed-integer programming (MIP) solvers (e.g. Gurobi, CPLEX, XPRESS, Cbc or SCIP). In this tutorial, we will:\n",
    "\n",
    "1. Install the Julia/JuMP version of MIPLearn\n",
    "2. Model a simple optimization problem using JuMP\n",
    "3. Generate training data and train the ML models\n",
    "4. Use the ML models together with Gurobi to solve new instances\n",
    "\n",
    "<div class=\"alert alert-warning\">\n",
    "Warning\n",
    "    \n",
    "MIPLearn is still in an early stage of development. If you run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions, and pull requests are also very welcome!\n",
    "    \n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f0d159b8",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "MIPLearn is available in two versions:\n",
    "\n",
    "- Python version, compatible with the Pyomo modeling language,\n",
    "- Julia version, compatible with the JuMP modeling language.\n",
    "\n",
    "In this tutorial, we will demonstrate how to install and use the Julia/JuMP version of the package. The first step is to install the Julia programming language on your computer. [See the official instructions for more details](https://julialang.org/downloads/). Note that MIPLearn was developed and tested with Julia 1.6, and may not be compatible with newer versions of the language. After Julia is installed, launch its console and run the following commands to download and install the package:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b16685be",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Path `/home/axavier/Packages/MIPLearn.jl/dev` exists and looks like the correct package. Using existing path.\n",
      "\u001b[32m\u001b[1m   Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1m  No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Project.toml`\n",
      "\u001b[32m\u001b[1m  No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Manifest.toml`\n"
     ]
    }
   ],
   "source": [
    "using Pkg\n",
    "#Pkg.add(PackageSpec(url=\"https://github.com/ANL-CEEESA/MIPLearn.jl.git\"))\n",
    "Pkg.develop(PackageSpec(path=\"/home/axavier/Packages/MIPLearn.jl/dev\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e5ed7716",
   "metadata": {},
   "source": [
    "In addition to MIPLearn itself, we will also install a few other packages that are required for this tutorial:\n",
    "\n",
    "- [**Gurobi**](https://www.gurobi.com/), a state-of-the-art MIP solver\n",
    "- [**JuMP**](https://jump.dev/), an open source modeling language for Julia\n",
    "- [**Distributions.jl**](https://github.com/JuliaStats/Distributions.jl), a statistics package that we will use to generate random inputs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "f88155c5",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[32m\u001b[1m    Updating\u001b[22m\u001b[39m registry at `~/.julia/registries/General`\n",
      "\u001b[32m\u001b[1m    Updating\u001b[22m\u001b[39m git-repo `https://github.com/JuliaRegistries/General.git`\n",
      "\u001b[32m\u001b[1m   Resolving\u001b[22m\u001b[39m package versions...\n",
      "\u001b[32m\u001b[1m  No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Project.toml`\n",
      "\u001b[32m\u001b[1m  No Changes\u001b[22m\u001b[39m to `~/Packages/MIPLearn/dev/docs/jump-tutorials/Manifest.toml`\n"
     ]
    }
   ],
   "source": [
    "using Pkg\n",
    "Pkg.add([\n",
    "    PackageSpec(name=\"Gurobi\", version=\"0.9.14\"),\n",
    "    PackageSpec(name=\"JuMP\", version=\"0.21\"),\n",
    "    PackageSpec(name=\"Distributions\", version=\"0.25\"),\n",
    "    PackageSpec(name=\"Glob\", version=\"1\"),\n",
    "])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0e1dda5",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "    \n",
    "In the code above, we install specific versions of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is usually a recommended practice for all Julia projects.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "378b6a97",
   "metadata": {},
   "source": [
    "## Modeling a simple optimization problem\n",
    "\n",
    "To illustrate how MIPLearn can be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem**, a practical optimization problem solved daily by electric grid operators around the world.\n",
    "\n",
    "Suppose that you work at a utility company, and that it is your job to decide which electrical generators should be online at a certain hour of the day, as well as how much power each generator should produce. More specifically, assume that your company owns $n$ generators, denoted by $g_1, \\ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\\text{min}_i$ and $p^\\text{max}_i$ megawatts of power, and it costs your company $c^\\text{fix}_i + c^\\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. You also know that the total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts). To minimize the costs to your company, which generators should be online, and how much power should they produce?\n",
    "\n",
    "This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \\in \\{0,1\\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \\geq 0$ be a decision variable indicating how much power $g_i$ produces. The problem is then given by:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\text{minimize } \\quad & \\sum_{i=1}^n \\left( c^\\text{fix}_i x_i + c^\\text{var}_i y_i \\right) \\\\\n",
    "\\text{subject to } \\quad & y_i \\leq p^\\text{max}_i x_i & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq p^\\text{min}_i x_i & i=1,\\ldots,n \\\\\n",
    "& \\sum_{i=1}^n y_i = d \\\\\n",
    "& x_i \\in \\{0,1\\} & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq 0 & i=1,\\ldots,n\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "    \n",
    "We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem. See benchmarks for more details.\n",
    "    \n",
    "</div>\n",
    "\n",
    "Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Julia and JuMP. We start by defining a data structure that holds all the input data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "798b2f6c",
   "metadata": {},
   "outputs": [],
   "source": [
    "Base.@kwdef struct UnitCommitmentData\n",
    "    demand::Float64\n",
    "    pmin::Vector{Float64}\n",
    "    pmax::Vector{Float64}\n",
    "    cfix::Vector{Float64}\n",
    "    cvar::Vector{Float64}\n",
    "end;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "104b709a",
   "metadata": {},
   "source": [
    "Next, we create a function that converts this data structure into a concrete JuMP model. For more details on the JuMP syntax, see [the official JuMP documentation](https://jump.dev/JuMP.jl/stable/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "a7c048e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "using JuMP\n",
    "\n",
    "function build_uc_model(data::UnitCommitmentData)::Model\n",
    "    model = Model()\n",
    "    n = length(data.pmin)\n",
    "    @variable(model, x[1:n], Bin)\n",
    "    @variable(model, y[1:n] >= 0)\n",
    "    @objective(\n",
    "        model,\n",
    "        Min,\n",
    "        sum(\n",
    "            data.cfix[i] * x[i] +\n",
    "            data.cvar[i] * y[i]\n",
    "            for i in 1:n\n",
    "        )\n",
    "    )\n",
    "    @constraint(model, eq_max_power[i in 1:n], y[i] <= data.pmax[i] * x[i])\n",
    "    @constraint(model, eq_min_power[i in 1:n], y[i] >= data.pmin[i] * x[i])\n",
    "    @constraint(model, eq_demand, sum(y[i] for i in 1:n) == data.demand)\n",
    "    return model\n",
    "end;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f10142e",
   "metadata": {},
   "source": [
    "At this point, we can already use JuMP and any mixed-integer linear programming solver to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators, using Gurobi:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "bc2022a4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "obj = 1320.0\n",
      "  x = [0.0, 1.0, 1.0]\n",
      "  y = [0.0, 60.0, 40.0]\n"
     ]
    }
   ],
   "source": [
    "using Gurobi\n",
    "\n",
    "model = build_uc_model(\n",
    "    UnitCommitmentData(\n",
    "        demand = 100.0,\n",
    "        pmin = [10, 20, 30],\n",
    "        pmax = [50, 60, 70],\n",
    "        cfix = [700, 600, 500],\n",
    "        cvar = [1.5, 2.0, 2.5],\n",
    "    )\n",
    ")\n",
    "\n",
    "gurobi = optimizer_with_attributes(Gurobi.Optimizer, \"Threads\" => 1, \"Seed\" => 42)\n",
    "set_optimizer(model, gurobi)\n",
    "set_silent(model)\n",
    "optimize!(model)\n",
    "\n",
    "println(\"obj = \", objective_value(model))\n",
    "println(\"  x = \", round.(value.(model[:x])))\n",
    "println(\"  y = \", round.(value.(model[:y]), digits=2));"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9ee6958b",
   "metadata": {},
   "source": [
    "Running the code above, we found that the optimal solution for our small problem instance costs \\$1320. It is achieved by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f34e3d44",
   "metadata": {},
   "source": [
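    "As a quick sanity check on the three-generator instance solved above (a worked calculation added for illustration, not solver output), we can plug the reported solution into the formulation. With generators 2 and 3 online, the objective is\n",
    "\n",
    "$$\n",
    "(600 + 500) + (2.0 \\times 60 + 2.5 \\times 40) = 1100 + 220 = 1320,\n",
    "$$\n",
    "\n",
    "which matches the printed value. The solution also satisfies the constraints: $20 \\leq 60 \\leq 60$ and $30 \\leq 40 \\leq 70$ respect the production limits, and $60 + 40 = 100$ equals the demand.\n",
    "\n",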
    "## Generating training data\n",
    "\n",
    "Although Gurobi could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as is often the case in practice, it could make sense to spend some time upfront generating a **trained** version of Gurobi, which can solve new instances (similar to the ones it was trained on) faster.\n",
    "\n",
    "In the following, we will use MIPLearn to train machine learning models that can be used to accelerate Gurobi's performance on a particular set of instances. More specifically, MIPLearn will train a model that is able to predict the optimal solution for instances that follow a given probability distribution, then it will provide this predicted solution to Gurobi as a warm start.\n",
    "\n",
    "Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a498e1e1",
   "metadata": {},
   "outputs": [],
   "source": [
    "using Distributions\n",
    "using Random\n",
    "\n",
    "function random_uc_data(; samples::Int, n::Int, seed=42)\n",
    "    Random.seed!(seed)\n",
    "    pmin = rand(Uniform(100, 500.0), n)\n",
    "    pmax = pmin .* rand(Uniform(2.0, 2.5), n)\n",
    "    cfix = pmin .* rand(Uniform(100.0, 125.0), n)\n",
    "    cvar = rand(Uniform(1.25, 1.5), n)\n",
    "    return [\n",
    "        UnitCommitmentData(;\n",
    "            pmin,\n",
    "            pmax,\n",
    "            cfix,\n",
    "            cvar,\n",
    "            demand = sum(pmax) * rand(Uniform(0.5, 0.75)),\n",
    "        )\n",
    "        for i in 1:samples\n",
    "    ]\n",
    "end;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e33bb12c",
   "metadata": {},
   "source": [
    "In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.\n",
    "\n",
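    "A short calculation, offered as a sketch based only on the sampling ranges in the generator above, suggests that every instance it produces is feasible. Each $p^\\text{max}_i$ is drawn as $p^\\text{min}_i$ times a factor in $[2, 2.5]$, so $p^\\text{min}_i \\leq \\frac{1}{2} p^\\text{max}_i$, and the demand $d$ is drawn as $\\sum_{i=1}^n p^\\text{max}_i$ times a factor in $[0.5, 0.75]$. Therefore:\n",
    "\n",
    "$$\n",
    "\\sum_{i=1}^n p^\\text{min}_i \\leq \\frac{1}{2} \\sum_{i=1}^n p^\\text{max}_i \\leq d \\leq \\frac{3}{4} \\sum_{i=1}^n p^\\text{max}_i\n",
    "$$\n",
    "\n",
    "With all generators online, any total output between $\\sum_i p^\\text{min}_i$ and $\\sum_i p^\\text{max}_i$ is achievable, and $d$ lies in that range, so at least one feasible solution always exists.\n",
    "\n",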
    "Now we generate 500 instances of this problem, each one with 50 generators. We use 450 of these instances for training and keep the remaining 50 for testing. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold the entire training data, as well as the concrete JuMP models, in memory. Files also make it much easier to solve multiple instances simultaneously, potentially even on multiple machines. We will cover parallel and distributed computing in a future tutorial. The code below generates the files `uc/train/00001.jld2`, `uc/train/00002.jld2`, etc., which contain the input data in [JLD2 format](https://github.com/JuliaIO/JLD2.jl)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "5358a046",
   "metadata": {},
   "outputs": [],
   "source": [
    "using MIPLearn\n",
    "data = random_uc_data(samples=500, n=50);\n",
    "train_files = MIPLearn.save(data[1:450], \"uc/train/\")\n",
    "test_files  = MIPLearn.save(data[451:500], \"uc/test/\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38a27d1c",
   "metadata": {},
   "source": [
    "Finally, we use `LearningSolver` to solve all the training instances. `LearningSolver` is the main component provided by MIPLearn, which integrates MIP solvers and ML. The optimal solutions, along with other useful training data, are stored in HDF5 files `uc/train/00001.h5`, `uc/train/00002.h5`, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "c341b12d",
   "metadata": {},
   "outputs": [],
   "source": [
    "solver = LearningSolver(gurobi)\n",
    "solve!(solver, train_files, build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "189b4f60",
   "metadata": {},
   "source": [
    "## Solving new instances\n",
    "\n",
    "With training data in hand, we can now fit the ML models using `MIPLearn.fit!`, then solve the test instances with `MIPLearn.solve!`, as shown below. The `tee=true` parameter asks MIPLearn to print the solver log to the screen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "1cf11450",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0xfb382c05\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+03]\n",
      "  Objective range  [1e+00, 6e+04]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+04, 2e+04]\n",
      "Presolve removed 100 rows and 50 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 50 columns, 50 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    7.0629410e+05   6.782322e+02   0.000000e+00      0s\n",
      "       1    8.0678161e+05   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.00 seconds\n",
      "Optimal objective  8.067816095e+05\n",
      "\n",
      "User-callback calls 33, time in user-callback 0.00 sec\n",
      "\n",
      "Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0x7bb6bbd6\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+03]\n",
      "  Objective range  [1e+00, 6e+04]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+04, 2e+04]\n",
      "\n",
      "User MIP start produced solution with objective 822175 (0.00s)\n",
      "User MIP start produced solution with objective 812767 (0.00s)\n",
      "User MIP start produced solution with objective 811628 (0.00s)\n",
      "User MIP start produced solution with objective 809648 (0.01s)\n",
      "User MIP start produced solution with objective 808536 (0.01s)\n",
      "Loaded user MIP start with objective 808536\n",
      "\n",
      "Presolve time: 0.00s\n",
      "Presolved: 101 rows, 100 columns, 250 nonzeros\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "\n",
      "Root relaxation: objective 8.067816e+05, 55 iterations, 0.00 seconds\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 806781.610    0    1 808536.496 806781.610  0.22%     -    0s\n",
      "H    0     0                    808091.02482 806781.610  0.16%     -    0s\n",
      "     0     0 807198.955    0    2 808091.025 807198.955  0.11%     -    0s\n",
      "     0     0 807198.955    0    1 808091.025 807198.955  0.11%     -    0s\n",
      "     0     0 807198.955    0    2 808091.025 807198.955  0.11%     -    0s\n",
      "     0     0 807226.059    0    3 808091.025 807226.059  0.11%     -    0s\n",
      "     0     0 807240.578    0    5 808091.025 807240.578  0.11%     -    0s\n",
      "     0     0 807240.663    0    5 808091.025 807240.663  0.11%     -    0s\n",
      "     0     0 807259.825    0    4 808091.025 807259.825  0.10%     -    0s\n",
      "     0     0 807275.314    0    5 808091.025 807275.314  0.10%     -    0s\n",
      "     0     0 807279.037    0    6 808091.025 807279.037  0.10%     -    0s\n",
      "     0     0 807291.881    0    8 808091.025 807291.881  0.10%     -    0s\n",
      "     0     0 807325.323    0    6 808091.025 807325.323  0.09%     -    0s\n",
      "     0     0 807326.015    0    7 808091.025 807326.015  0.09%     -    0s\n",
      "     0     0 807326.798    0    7 808091.025 807326.798  0.09%     -    0s\n",
      "     0     0 807328.550    0    8 808091.025 807328.550  0.09%     -    0s\n",
      "     0     0 807331.193    0    9 808091.025 807331.193  0.09%     -    0s\n",
      "     0     0 807332.143    0    7 808091.025 807332.143  0.09%     -    0s\n",
      "     0     0 807335.410    0    8 808091.025 807335.410  0.09%     -    0s\n",
      "     0     0 807335.452    0    8 808091.025 807335.452  0.09%     -    0s\n",
      "     0     0 807337.253    0    9 808091.025 807337.253  0.09%     -    0s\n",
      "     0     0 807337.409    0    9 808091.025 807337.409  0.09%     -    0s\n",
      "     0     0 807347.720    0    8 808091.025 807347.720  0.09%     -    0s\n",
      "     0     0 807352.765    0    7 808091.025 807352.765  0.09%     -    0s\n",
      "     0     0 807366.618    0    9 808091.025 807366.618  0.09%     -    0s\n",
      "     0     0 807368.345    0   10 808091.025 807368.345  0.09%     -    0s\n",
      "     0     0 807369.195    0   10 808091.025 807369.195  0.09%     -    0s\n",
      "     0     0 807392.319    0    8 808091.025 807392.319  0.09%     -    0s\n",
      "     0     0 807401.436    0    9 808091.025 807401.436  0.09%     -    0s\n",
      "     0     0 807405.685    0    8 808091.025 807405.685  0.08%     -    0s\n",
      "     0     0 807411.994    0    8 808091.025 807411.994  0.08%     -    0s\n",
      "     0     0 807424.710    0    9 808091.025 807424.710  0.08%     -    0s\n",
      "     0     0 807424.867    0   11 808091.025 807424.867  0.08%     -    0s\n",
      "     0     0 807427.428    0   12 808091.025 807427.428  0.08%     -    0s\n",
      "     0     0 807433.211    0   10 808091.025 807433.211  0.08%     -    0s\n",
      "     0     0 807439.215    0   10 808091.025 807439.215  0.08%     -    0s\n",
      "     0     0 807439.303    0   11 808091.025 807439.303  0.08%     -    0s\n",
      "     0     0 807443.312    0   11 808091.025 807443.312  0.08%     -    0s\n",
      "     0     0 807444.488    0   12 808091.025 807444.488  0.08%     -    0s\n",
      "     0     0 807444.499    0   13 808091.025 807444.499  0.08%     -    0s\n",
      "     0     0 807444.499    0   13 808091.025 807444.499  0.08%     -    0s\n",
      "     0     2 807445.982    0   13 808091.025 807445.982  0.08%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Cover: 3\n",
      "  MIR: 18\n",
      "  StrongCG: 1\n",
      "  Flow cover: 3\n",
      "\n",
      "Explored 39 nodes (333 simplex iterations) in 0.03 seconds\n",
      "Thread count was 1 (of 32 available processors)\n",
      "\n",
      "Solution count 6: 808091 808536 809648 ... 822175\n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.080910248225e+05, best bound 8.080640878016e+05, gap 0.0033%\n",
      "\n",
      "User-callback calls 341, time in user-callback 0.00 sec\n",
      "\n"
     ]
    }
   ],
   "source": [
    "solver_ml = LearningSolver(gurobi)\n",
    "fit!(solver_ml, train_files, build_uc_model)\n",
    "solve!(solver_ml, test_files[1], build_uc_model, tee=true);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "872211e7",
   "metadata": {},
   "source": [
    "By examining the solve log above, specifically the line `Loaded user MIP start with objective...`, we can see that MIPLearn was able to construct an initial solution which turned out to be near optimal for the problem. Now let us repeat the code above, but using an untrained solver. Note that the `fit!` line is omitted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "fc1e3629",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0xfb382c05\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+03]\n",
      "  Objective range  [1e+00, 6e+04]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+04, 2e+04]\n",
      "Presolve removed 100 rows and 50 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 50 columns, 50 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    7.0629410e+05   6.782322e+02   0.000000e+00      0s\n",
      "       1    8.0678161e+05   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.00 seconds\n",
      "Optimal objective  8.067816095e+05\n",
      "\n",
      "User-callback calls 33, time in user-callback 0.00 sec\n",
      "\n",
      "Gurobi Optimizer version 9.1.1 build v9.1.1rc0 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0x899aac3d\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+03]\n",
      "  Objective range  [1e+00, 6e+04]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+04, 2e+04]\n",
      "Found heuristic solution: objective 893073.33620\n",
      "Presolve time: 0.00s\n",
      "Presolved: 101 rows, 100 columns, 250 nonzeros\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "\n",
      "Root relaxation: objective 8.067816e+05, 55 iterations, 0.00 seconds\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 806781.610    0    1 893073.336 806781.610  9.66%     -    0s\n",
      "H    0     0                    842766.25007 806781.610  4.27%     -    0s\n",
      "H    0     0                    818273.05208 806781.610  1.40%     -    0s\n",
      "     0     0 807198.955    0    2 818273.052 807198.955  1.35%     -    0s\n",
      "H    0     0                    813499.43980 807198.955  0.77%     -    0s\n",
      "     0     0 807246.085    0    3 813499.440 807246.085  0.77%     -    0s\n",
      "     0     0 807272.377    0    4 813499.440 807272.377  0.77%     -    0s\n",
      "     0     0 807284.557    0    1 813499.440 807284.557  0.76%     -    0s\n",
      "     0     0 807298.666    0    2 813499.440 807298.666  0.76%     -    0s\n",
      "     0     0 807305.559    0    6 813499.440 807305.559  0.76%     -    0s\n",
      "H    0     0                    812223.58825 807305.559  0.61%     -    0s\n",
      "     0     0 807309.503    0    4 812223.588 807309.503  0.61%     -    0s\n",
      "     0     0 807339.469    0    4 812223.588 807339.469  0.60%     -    0s\n",
      "     0     0 807344.135    0    6 812223.588 807344.135  0.60%     -    0s\n",
      "     0     0 807359.565    0    7 812223.588 807359.565  0.60%     -    0s\n",
      "     0     0 807371.997    0    8 812223.588 807371.997  0.60%     -    0s\n",
      "     0     0 807372.245    0    8 812223.588 807372.245  0.60%     -    0s\n",
      "     0     0 807378.545    0    9 812223.588 807378.545  0.60%     -    0s\n",
      "     0     0 807378.545    0    9 812223.588 807378.545  0.60%     -    0s\n",
      "H    0     0                    811628.30751 807378.545  0.52%     -    0s\n",
      "H    0     0                    810280.45754 807378.545  0.36%     -    0s\n",
      "     0     0 807378.545    0    1 810280.458 807378.545  0.36%     -    0s\n",
      "H    0     0                    810123.10116 807378.545  0.34%     -    0s\n",
      "     0     0 807378.545    0    1 810123.101 807378.545  0.34%     -    0s\n",
      "     0     0 807378.545    0    3 810123.101 807378.545  0.34%     -    0s\n",
      "     0     0 807378.545    0    7 810123.101 807378.545  0.34%     -    0s\n",
      "     0     0 807379.672    0    8 810123.101 807379.672  0.34%     -    0s\n",
      "     0     0 807379.905    0    9 810123.101 807379.905  0.34%     -    0s\n",
      "     0     0 807380.615    0   10 810123.101 807380.615  0.34%     -    0s\n",
      "     0     0 807402.384    0   10 810123.101 807402.384  0.34%     -    0s\n",
      "     0     0 807407.299    0   12 810123.101 807407.299  0.34%     -    0s\n",
      "     0     0 807407.299    0   12 810123.101 807407.299  0.34%     -    0s\n",
      "     0     2 807408.320    0   12 810123.101 807408.320  0.34%     -    0s\n",
      "H    3     3                    809647.65837 807476.463  0.27%   3.0    0s\n",
      "H   84    35                    808870.26352 807568.065  0.16%   2.7    0s\n",
      "H   99    29                    808536.49552 807588.561  0.12%   2.7    0s\n",
      "*  310     1               5    808091.02482 808069.217  0.00%   3.3    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Gomory: 3\n",
      "  Cover: 7\n",
      "  MIR: 9\n",
      "  Flow cover: 3\n",
      "\n",
      "Explored 311 nodes (1175 simplex iterations) in 0.06 seconds\n",
      "Thread count was 1 (of 32 available processors)\n",
      "\n",
      "Solution count 10: 808091 808536 808870 ... 818273\n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.080910248225e+05, best bound 8.080692169045e+05, gap 0.0027%\n",
      "\n",
      "User-callback calls 832, time in user-callback 0.00 sec\n",
      "\n"
     ]
    }
   ],
   "source": [
    "solver_baseline = LearningSolver(gurobi)\n",
    "solve!(solver_baseline, test_files[1], build_uc_model, tee=true);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b5ce528",
   "metadata": {},
   "source": [
    "In the log above, the `MIP start` line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution at the end, but it required using its own internal heuristic procedures. In this example, because we solve very small optimization problems, there was almost no difference in terms of running time. For larger problems, however, the difference can be significant. See benchmarks for more details.\n",
    "\n",
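    "To put rough numbers on this, we can recompute the relative gap shown in the first root-node line of each log, taken here as the difference between the incumbent and the best bound divided by the incumbent (a worked calculation from the logs above, not additional solver output):\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\text{trained solver:} \\quad & \\frac{808536.496 - 806781.610}{808536.496} \\approx 0.22\\% \\\\\n",
    "\\text{untrained solver:} \\quad & \\frac{893073.336 - 806781.610}{893073.336} \\approx 9.66\\%\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "Both runs share the same root bound, so the difference comes entirely from the quality of the initial solution. The trained solver also explored 39 nodes, compared to 311 for the untrained one.\n",
    "\n",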
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "In addition to partial initial solutions, MIPLearn is also able to predict lazy constraints, cutting planes and branching priorities. See the next tutorials for more details.\n",
    "</div>\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "It is not necessary to specify what ML models to use. MIPLearn, by default, will try a number of classical ML models and will choose the one that performs the best, based on k-fold cross validation. MIPLearn is also able to automatically collect features based on the MIP formulation of the problem and the solution to the LP relaxation, among other things, so it does not require handcrafted features. If you do want to customize the models and features, however, that is also possible, as we will see in a later tutorial.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "46da094b",
   "metadata": {},
   "source": [
    "## Accessing the solution\n",
    "\n",
    "In the example above, we used `MIPLearn.solve!` together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders, and could be retrieved by reading these files, but that is not very convenient. In the following example, we show how to build and solve a JuMP model entirely in-memory, using our trained solver."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "986f0c18",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "obj = 809710.340270503\n",
      "  x = [1.0, -0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0]\n",
      "  y = [696.38, 0.0, 249.05, 0.0, 1183.75, 0.0, 504.91, 387.32, 1178.0, 765.25]\n"
     ]
    }
   ],
   "source": [
    "# Construct model using previously defined functions\n",
    "data = random_uc_data(samples=1, n=50)[1]\n",
    "model = build_uc_model(data)\n",
    "\n",
    "# Solve model\n",
    "solve!(solver_ml, model)\n",
    "\n",
    "# Print part of the optimal solution\n",
    "println(\"obj = \", objective_value(model))\n",
    "println(\"  x = \", round.(value.(model[:x][1:10])))\n",
    "println(\"  y = \", round.(value.(model[:y][1:10]), digits=2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f43ed281",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.2",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/jump-tutorials/lazy-constraints.ipynb
+++ b/docs/jump-tutorials/lazy-constraints.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "18dd2957",
   "metadata": {},
   "source": [
    "# Modeling lazy constraints\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/jump-tutorials/user-cuts.ipynb
+++ b/docs/jump-tutorials/user-cuts.ipynb
@@ -1,29 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8e6b5f28",
   "metadata": {},
   "source": [
    "# Modeling user cuts\n",
    "\n",
    "TODO"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.6.0",
   "language": "julia",
   "name": "julia-1.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.6.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/make.bat
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF
 pushd %~dp0
 REM Command file for Sphinx documentation
 if "%SPHINXBUILD%" == "" (
 	set SPHINXBUILD=sphinx-build
 )
 set SOURCEDIR=.
 set BUILDDIR=_build
 %SPHINXBUILD% >NUL 2>NUL
 if errorlevel 9009 (
 	echo.
 	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
 	echo.installed, then set the SPHINXBUILD environment variable to point
 	echo.to the full path of the 'sphinx-build' executable. Alternatively you
 	echo.may add the Sphinx directory to PATH.
 	echo.
 	echo.If you don't have Sphinx installed, grab it from
 	echo.https://www.sphinx-doc.org/
 	exit /b 1
 )
 if "%1" == "" goto help
 %SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
 goto end
 :help
 %SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
 :end
 popd
--- a/docs/pyomo-tutorials/getting-started.ipynb
+++ b/docs/pyomo-tutorials/getting-started.ipynb
@@ -1,625 +0,0 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6b8983b1",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Getting started\n",
    "\n",
    "## Introduction\n",
    "\n",
    "**MIPLearn** is an open source framework that uses machine learning (ML) to accelerate the performance of both commercial and open source mixed-integer programming solvers (e.g. Gurobi, CPLEX, XPRESS, Cbc or SCIP). In this tutorial, we will:\n",
    "\n",
    "1. Install the Python/Pyomo version of MIPLearn\n",
    "2. Model a simple optimization problem using Pyomo\n",
    "3. Generate training data and train the ML models\n",
    "4. Use the ML models together with Gurobi to solve new instances\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "The Python/Pyomo version of MIPLearn is currently only compatible with Gurobi, CPLEX and XPRESS. For broader solver compatibility, see the Julia/JuMP version of the package.\n",
    "</div>\n",
    "\n",
    "<div class=\"alert alert-warning\">\n",
    "Warning\n",
    "    \n",
    "MIPLearn is still in early development stage. If run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!\n",
    "    \n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02f0a927",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "MIPLearn is available in two versions:\n",
    "\n",
    "- Python version, compatible with the Pyomo modeling language,\n",
    "- Julia version, compatible with the JuMP modeling language.\n",
    "\n",
    "In this tutorial, we will demonstrate how to use and install the Python/Pyomo version of the package. The first step is to install Python 3.8+ in your computer. See the [official Python website for more instructions](https://www.python.org/downloads/). After Python is installed, we proceed to install MIPLearn using `pip`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "cd8a69c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install MIPLearn==0.2.0.dev13"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8274543",
   "metadata": {},
   "source": [
    "In addition to MIPLearn itself, we will also install Gurobi 9.5, a state-of-the-art commercial MILP solver. This step also install a demo license for Gurobi, which should able to solve the small optimization problems in this tutorial. A paid license is required for solving large-scale problems."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dcc8756c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Looking in indexes: https://pypi.gurobi.com\n",
      "Requirement already satisfied: gurobipy<9.6,>=9.5 in /opt/anaconda3/envs/miplearn/lib/python3.8/site-packages (9.5.1)\n"
     ]
    }
   ],
   "source": [
    "!pip install --upgrade -i https://pypi.gurobi.com 'gurobipy>=9.5,<9.6'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a14e4550",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "    \n",
    "In the code above, we install specific version of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is usually a recommended practice for all Python projects.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16b86823",
   "metadata": {},
   "source": [
    "## Modeling a simple optimization problem\n",
    "\n",
    "To illustrate how can MIPLearn be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem,** a practical optimization problem solved daily by electric grid operators around the world. \n",
    "\n",
    "Suppose that you work at a utility company, and that it is your job to decide which electrical generators should be online at a certain hour of the day, as well as how much power should each generator produce. More specifically, assume that your company owns $n$ generators, denoted by $g_1, \\ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\\text{min}_i$ to $p^\\text{max}_i$ megawatts of power, and it costs your company $c^\\text{fix}_i + c^\\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. You also know that the total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts). To minimize the costs to your company, which generators should be online, and how much power should they produce?\n",
    "\n",
    "This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \\in \\{0,1\\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \\geq 0$ be a decision variable indicating how much power does $g_i$ produce. The problem is then given by:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\text{minimize } \\quad & \\sum_{i=1}^n \\left( c^\\text{fix}_i x_i + c^\\text{var}_i y_i \\right) \\\\\n",
    "\\text{subject to } \\quad & y_i \\leq p^\\text{max}_i x_i & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq p^\\text{min}_i x_i & i=1,\\ldots,n \\\\\n",
    "& \\sum_{i=1}^n y_i = d \\\\\n",
    "& x_i \\in \\{0,1\\} & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq 0 & i=1,\\ldots,n\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "    \n",
    "We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem. See benchmarks for more details.\n",
    "    \n",
    "</div>\n",
    "\n",
    "Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Python and Pyomo. We start by defining a data class `UnitCommitmentData`, which holds all the input data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "22a67170-10b4-43d3-8708-014d91141e73",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from dataclasses import dataclass\n",
    "import numpy as np\n",
    "\n",
    "@dataclass\n",
    "class UnitCommitmentData:\n",
    "    demand: float\n",
    "    pmin: np.ndarray\n",
    "    pmax: np.ndarray\n",
    "    cfix: np.ndarray\n",
    "    cvar: np.ndarray"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29f55efa-0751-465a-9b0a-a821d46a3d40",
   "metadata": {},
   "source": [
    "Next, we write a `build_uc_model` function, which converts the input data into a concrete Pyomo model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "2f67032f-0d74-4317-b45c-19da0ec859e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pyomo.environ as pe\n",
    "\n",
    "def build_uc_model(data: UnitCommitmentData) -> pe.ConcreteModel:\n",
    "    model = pe.ConcreteModel()\n",
    "    n = len(data.pmin)\n",
    "    model.x = pe.Var(range(n), domain=pe.Binary)\n",
    "    model.y = pe.Var(range(n), domain=pe.NonNegativeReals)\n",
    "    model.obj = pe.Objective(\n",
    "        expr=sum(\n",
    "            data.cfix[i] * model.x[i] +\n",
    "            data.cvar[i] * model.y[i]\n",
    "            for i in range(n)\n",
    "        )\n",
    "    )\n",
    "    model.eq_max_power = pe.ConstraintList()\n",
    "    model.eq_min_power = pe.ConstraintList()\n",
    "    for i in range(n):\n",
    "        model.eq_max_power.add(model.y[i] <= data.pmax[i] * model.x[i])\n",
    "        model.eq_min_power.add(model.y[i] >= data.pmin[i] * model.x[i])\n",
    "    model.eq_demand = pe.Constraint(\n",
    "        expr=sum(model.y[i] for i in range(n)) == data.demand,\n",
    "    )\n",
    "    return model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c22714a3",
   "metadata": {},
   "source": [
    "At this point, we can already use Pyomo and any mixed-integer linear programming solver to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2a896f47",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set parameter Threads to value 1\n",
      "Set parameter Seed to value 42\n",
      "Restricted license - for non-production use only - expires 2023-10-25\n",
      "obj = 1320.0\n",
      "x = [-0.0, 1.0, 1.0]\n",
      "y = [0.0, 60.0, 40.0]\n"
     ]
    }
   ],
   "source": [
    "model = build_uc_model(\n",
    "    UnitCommitmentData(\n",
    "        demand = 100.0,\n",
    "        pmin = [10, 20, 30],\n",
    "        pmax = [50, 60, 70],\n",
    "        cfix = [700, 600, 500],\n",
    "        cvar = [1.5, 2.0, 2.5],\n",
    "    )\n",
    ")\n",
    "\n",
    "solver = pe.SolverFactory(\"gurobi_persistent\")\n",
    "solver.set_instance(model)\n",
    "solver.solve()\n",
    "print(\"obj =\", model.obj())\n",
    "print(\"x =\", [model.x[i].value for i in range(3)])\n",
    "print(\"y =\", [model.y[i].value for i in range(3)])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41b03bbc",
   "metadata": {},
   "source": [
    "Running the code above, we found that the optimal solution for our small problem instance costs \\$1320. It is achieve by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power."
   ]
  },
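  {
   "cell_type": "markdown",
   "id": "uc-cost-check",
   "metadata": {},
   "source": [
    "As a quick sanity check, the reported cost can be reproduced by hand from the input data. Generator 1 is offline and contributes nothing, so only the fixed and variable costs of generators 2 and 3 remain:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "c^\\text{fix}_2 + c^\\text{var}_2 y_2 & = 600 + 2.0 \\times 60 = 720 \\\\\n",
    "c^\\text{fix}_3 + c^\\text{var}_3 y_3 & = 500 + 2.5 \\times 40 = 600 \\\\\n",
    "\\text{total cost} & = 720 + 600 = 1320\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "The solution also satisfies the constraints: $y_2 + y_3 = 60 + 40 = 100 = d$, and each online generator stays within its limits ($20 \\leq 60 \\leq 60$ and $30 \\leq 40 \\leq 70$)."
   ]
  },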
  {
   "cell_type": "markdown",
   "id": "cf60c1dd",
   "metadata": {},
   "source": [
    "## Generating training data\n",
    "\n",
    "Although Gurobi could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as it is often the case in practice, it could make sense to spend some time upfront generating a **trained** version of Gurobi, which can solve new instances (similar to the ones it was trained on) faster.\n",
    "\n",
    "In the following, we will use MIPLearn to train machine learning models that is able to predict the optimal solution for instances that follow a given probability distribution, then it will provide this predicted solution to Gurobi as a warm start. Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "5eb09fab",
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.stats import uniform\n",
    "from typing import List\n",
    "import random\n",
    "\n",
    "def random_uc_data(samples: int, n: int, seed: int = 42) -> List[UnitCommitmentData]:\n",
    "    random.seed(seed)\n",
    "    np.random.seed(seed)\n",
    "    pmin = uniform(loc=100_000.0, scale=400_000.0).rvs(n)\n",
    "    pmax = pmin * uniform(loc=2.0, scale=2.5).rvs(n)\n",
    "    cfix = pmin * uniform(loc=100.0, scale=25.0).rvs(n)\n",
    "    cvar = uniform(loc=1.25, scale=0.25).rvs(n)\n",
    "    return [\n",
    "        UnitCommitmentData(\n",
    "            demand = pmax.sum() * uniform(loc=0.5, scale=0.25).rvs(),\n",
    "            pmin = pmin,\n",
    "            pmax = pmax,\n",
    "            cfix = cfix,\n",
    "            cvar = cvar,\n",
    "        )\n",
    "        for i in range(samples)\n",
    "    ]"
   ]
  },
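  {
   "cell_type": "markdown",
   "id": "uc-random-distributions",
   "metadata": {},
   "source": [
    "When reading the generator above, keep in mind that `scipy.stats.uniform(loc, scale)` samples from the interval $[\\text{loc}, \\text{loc} + \\text{scale}]$, not from $[\\text{loc}, \\text{scale}]$. Written out, and using $U(a, b)$ for the uniform distribution on $[a, b]$, the sampled quantities are as follows. The multipliers $\\alpha_i$, $\\beta_i$ and $\\gamma$ are auxiliary symbols introduced here only for illustration; they do not appear in the code.\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "p^\\text{min}_i & \\sim U(10^5,\\; 5 \\times 10^5) \\\\\n",
    "p^\\text{max}_i & = \\alpha_i \\, p^\\text{min}_i, & \\alpha_i \\sim U(2.0,\\; 4.5) \\\\\n",
    "c^\\text{fix}_i & = \\beta_i \\, p^\\text{min}_i, & \\beta_i \\sim U(100,\\; 125) \\\\\n",
    "c^\\text{var}_i & \\sim U(1.25,\\; 1.5) \\\\\n",
    "d & = \\gamma \\sum_{i=1}^n p^\\text{max}_i, & \\gamma \\sim U(0.5,\\; 0.75)\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "In particular, the demand always lies between 50% and 75% of the total installed capacity, so every generated instance is feasible when all generators are online."
   ]
  },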
  {
   "cell_type": "markdown",
   "id": "3a03a7ac",
   "metadata": {},
   "source": [
    "In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.\n",
    "\n",
    "Now we generate 500 instances of this problem, each one with 50 generators, and we use 450 of these instances for training. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold in memory the entire training data, as well as the concrete Pyomo models. Files also make it much easier to solve multiple instances simultaneously, potentially even on multiple machines. We will cover parallel and distributed computing in a future tutorial. The code below generates the files `uc/train/00000.pkl.gz`, `uc/train/00001.pkl.gz`, etc., which contain the input data in compressed (gzipped) pickle format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6156752c",
   "metadata": {},
   "outputs": [],
   "source": [
    "from miplearn import save\n",
    "data = random_uc_data(samples=500, n=50)\n",
    "train_files = save(data[0:450], \"uc/train/\")\n",
    "test_files  = save(data[450:500], \"uc/test/\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b17af877",
   "metadata": {},
   "source": [
    "Finally, we use `LearningSolver` to solve all the training instances. `LearningSolver` is the main component provided by MIPLearn, which integrates MIP solvers and ML. The optimal solutions, along with other useful training data, are stored in HDF5 files `uc/train/00000.h5`, `uc/train/00001.h5`, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "7623f002",
   "metadata": {},
   "outputs": [],
   "source": [
    "from miplearn import LearningSolver\n",
    "solver = LearningSolver()\n",
    "solver.solve(train_files, build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f24ee83",
   "metadata": {},
   "source": [
    "## Solving test instances\n",
    "\n",
    "With training data in hand, we can now fit the ML models, using the `LearningSolver.fit` method, then solve the test instances with `LearningSolver.solve`, as shown below. The `tee=True` parameter asks MIPLearn to print the solver log to the screen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "c8385030",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set parameter LogFile to value \"/tmp/tmpvbaqbyty.log\"\n",
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 9.5.1 build v9.5.1rc2 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0x8de73876\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+07, 2e+07]\n",
      "Presolve removed 100 rows and 50 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 50 columns, 50 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    5.7349081e+08   1.044003e+04   0.000000e+00      0s\n",
      "       1    6.8268465e+08   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.00 seconds (0.00 work units)\n",
      "Optimal objective  6.826846503e+08\n",
      "Set parameter LogFile to value \"\"\n",
      "Set parameter LogFile to value \"/tmp/tmp48j6n35b.log\"\n",
      "Gurobi Optimizer version 9.5.1 build v9.5.1rc2 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0x200d64ba\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+07, 2e+07]\n",
      "\n",
      "User MIP start produced solution with objective 6.84841e+08 (0.00s)\n",
      "Loaded user MIP start with objective 6.84841e+08\n",
      "\n",
      "Presolve time: 0.00s\n",
      "Presolved: 101 rows, 100 columns, 250 nonzeros\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "\n",
      "Root relaxation: objective 6.826847e+08, 56 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 6.8268e+08    0    1 6.8484e+08 6.8268e+08  0.31%     -    0s\n",
      "     0     0 6.8315e+08    0    3 6.8484e+08 6.8315e+08  0.25%     -    0s\n",
      "     0     0 6.8315e+08    0    1 6.8484e+08 6.8315e+08  0.25%     -    0s\n",
      "     0     0 6.8315e+08    0    3 6.8484e+08 6.8315e+08  0.25%     -    0s\n",
      "     0     0 6.8315e+08    0    4 6.8484e+08 6.8315e+08  0.25%     -    0s\n",
      "     0     0 6.8315e+08    0    4 6.8484e+08 6.8315e+08  0.25%     -    0s\n",
      "     0     2 6.8327e+08    0    4 6.8484e+08 6.8327e+08  0.23%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Flow cover: 3\n",
      "\n",
      "Explored 32 nodes (155 simplex iterations) in 0.02 seconds (0.00 work units)\n",
      "Thread count was 1 (of 32 available processors)\n",
      "\n",
      "Solution count 1: 6.84841e+08 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 6.848411655488e+08, best bound 6.848411655488e+08, gap 0.0000%\n",
      "Set parameter LogFile to value \"\"\n",
      "WARNING: Cannot get reduced costs for MIP.\n",
      "WARNING: Cannot get duals for MIP.\n"
     ]
    }
   ],
   "source": [
    "solver_ml = LearningSolver()\n",
    "solver_ml.fit(train_files, build_uc_model)\n",
    "solver_ml.solve(test_files[0:1], build_uc_model, tee=True);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61da6dad-7f56-4edb-aa26-c00eb5f946c0",
   "metadata": {},
   "source": [
    "By examining the solve log above, specifically the line `Loaded user MIP start with objective...`, we can see that MIPLearn was able to construct an initial solution which turned out to be the optimal solution to the problem. Now let us repeat the code above, but using an untrained solver. Note that the `fit` line is omitted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "33d15d6c-6db4-477f-bd4b-fe8e84e5f023",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set parameter LogFile to value \"/tmp/tmp3uhhdurw.log\"\n",
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 9.5.1 build v9.5.1rc2 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0x8de73876\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+07, 2e+07]\n",
      "Presolve removed 100 rows and 50 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 50 columns, 50 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    5.7349081e+08   1.044003e+04   0.000000e+00      0s\n",
      "       1    6.8268465e+08   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  6.826846503e+08\n",
      "Set parameter LogFile to value \"\"\n",
      "Set parameter LogFile to value \"/tmp/tmp18aqg2ic.log\"\n",
      "Gurobi Optimizer version 9.5.1 build v9.5.1rc2 (linux64)\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 1 threads\n",
      "Optimize a model with 101 rows, 100 columns and 250 nonzeros\n",
      "Model fingerprint: 0xb90d1075\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [2e+07, 2e+07]\n",
      "Found heuristic solution: objective 8.056576e+08\n",
      "Presolve time: 0.00s\n",
      "Presolved: 101 rows, 100 columns, 250 nonzeros\n",
      "Variable types: 50 continuous, 50 integer (50 binary)\n",
      "\n",
      "Root relaxation: objective 6.826847e+08, 56 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 6.8268e+08    0    1 8.0566e+08 6.8268e+08  15.3%     -    0s\n",
      "H    0     0                    7.099498e+08 6.8268e+08  3.84%     -    0s\n",
      "     0     0 6.8315e+08    0    3 7.0995e+08 6.8315e+08  3.78%     -    0s\n",
      "H    0     0                    6.883227e+08 6.8315e+08  0.75%     -    0s\n",
      "     0     0 6.8352e+08    0    4 6.8832e+08 6.8352e+08  0.70%     -    0s\n",
      "     0     0 6.8352e+08    0    4 6.8832e+08 6.8352e+08  0.70%     -    0s\n",
      "     0     0 6.8352e+08    0    1 6.8832e+08 6.8352e+08  0.70%     -    0s\n",
      "H    0     0                    6.862582e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     0 6.8352e+08    0    4 6.8626e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     0 6.8352e+08    0    4 6.8626e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     0 6.8352e+08    0    1 6.8626e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     0 6.8352e+08    0    3 6.8626e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     0 6.8352e+08    0    4 6.8626e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     0 6.8352e+08    0    4 6.8626e+08 6.8352e+08  0.40%     -    0s\n",
      "     0     2 6.8354e+08    0    4 6.8626e+08 6.8354e+08  0.40%     -    0s\n",
      "*   18     5               6    6.849018e+08 6.8413e+08  0.11%   3.1    0s\n",
      "H   24     1                    6.848412e+08 6.8426e+08  0.09%   3.2    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Gomory: 1\n",
      "  Flow cover: 2\n",
      "\n",
      "Explored 30 nodes (217 simplex iterations) in 0.02 seconds (0.00 work units)\n",
      "Thread count was 1 (of 32 available processors)\n",
      "\n",
      "Solution count 6: 6.84841e+08 6.84902e+08 6.86258e+08 ... 8.05658e+08\n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 6.848411655488e+08, best bound 6.848411655488e+08, gap 0.0000%\n",
      "Set parameter LogFile to value \"\"\n",
      "WARNING: Cannot get reduced costs for MIP.\n",
      "WARNING: Cannot get duals for MIP.\n"
     ]
    }
   ],
   "source": [
    "solver_baseline = LearningSolver()\n",
    "solver_baseline.solve(test_files[0:1], build_uc_model, tee=True);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6d37b88-9fcc-43ee-ac1e-2a7b1e51a266",
   "metadata": {},
   "source": [
    "In the log above, the `MIP start` line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution at the end, but it required using its own internal heuristic procedures. In this example, because we solve very small optimization problems, there was almost no difference in terms of running time. For larger problems, however, the difference can be significant. See benchmarks for more details.\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "In addition to partial initial solutions, MIPLearn is also able to predict lazy constraints, cutting planes and branching priorities. See the next tutorials for more details.\n",
    "</div>\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "It is not necessary to specify what ML models to use. MIPLearn, by default, will try a number of classical ML models and will choose the one that performs the best, based on k-fold cross validation. MIPLearn is also able to automatically collect features based on the MIP formulation of the problem and the solution to the LP relaxation, among other things, so it does not require handcrafted features. If you do want to customize the models and features, however, that is also possible, as we will see in a later tutorial.\n",
    "</div>"
   ]
  },
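  {
   "cell_type": "markdown",
   "id": "uc-root-gap-comparison",
   "metadata": {},
   "source": [
    "The `Gap` column in the two logs above makes the benefit of the warm start concrete. Gurobi reports the relative gap between the incumbent objective $z$ and the best bound $\\underline{z}$ as $|\\underline{z} - z| \\,/\\, |z|$. At the root node, both runs share the same bound (the LP relaxation value, about $6.82685 \\times 10^8$), so the gap differs only because of the incumbent. Using the values printed in the logs:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\text{with ML warm start:} \\quad & \\frac{6.84841 - 6.82685}{6.84841} \\approx 0.0031 = 0.31\\% \\\\\n",
    "\\text{without warm start:} \\quad & \\frac{8.05658 - 6.82685}{8.05658} \\approx 0.153 = 15.3\\%\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "The trained solver therefore starts the branch-and-bound search with a gap roughly 50 times smaller, even though both runs finish with the same optimal value."
   ]
  },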
  {
   "cell_type": "markdown",
   "id": "eec97f06",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Accessing the solution\n",
    "\n",
    "In the example above, we used `LearningSolver.solve` together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders, and could be retrieved by reading theses files, but that is not very convenient. In the following example, we show how to build and solve a Pyomo model entirely in-memory, using our trained solver."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "67a6cd18",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "obj = 903865807.3536932\n",
      " x = [1.0, 1.0, 1.0, 1.0, 1.0]\n",
      " y = [1105176.593734543, 1891284.5155055337, 1708177.4224033852, 1438329.610189608, 535496.3347187206]\n"
     ]
    }
   ],
   "source": [
    "# Construct model using previously defined functions\n",
    "data = random_uc_data(samples=1, n=50)[0]\n",
    "model = build_uc_model(data)\n",
    "\n",
    "# Solve model using ML + Gurobi\n",
    "solver_ml.solve(model)\n",
    "\n",
    "# Print part of the optimal solution\n",
    "print(\"obj =\", model.obj())\n",
    "print(\" x =\", [model.x[i].value for i in range(5)])\n",
    "print(\" y =\", [model.y[i].value for i in range(5)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5593d23a-83bd-4e16-8253-6300f5e3f63b",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/pyomo-tutorials/gurobi.env
+++ b/docs/pyomo-tutorials/gurobi.env
@@ -1,3 +0,0 @@
 OutputFlag 1
 Threads 1
 Seed 42
--- a/docs/tutorials/Manifest.toml
+++ b/docs/tutorials/Manifest.toml
@@ -0,0 +1,637 @@
 # This file is machine-generated - editing it directly is not advised
 julia_version = "1.9.0"
 manifest_format = "2.0"
 project_hash = "acf9261f767ae18f2b4613fd5590ea6a33f31e10"
 [[deps.ArgTools]]
 uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"
 version = "1.1.1"
 [[deps.Artifacts]]
 uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
 [[deps.Base64]]
 uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
 [[deps.BenchmarkTools]]
 deps = ["JSON", "Logging", "Printf", "Profile", "Statistics", "UUIDs"]
 git-tree-sha1 = "d9a9701b899b30332bbcb3e1679c41cce81fb0e8"
 uuid = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
 version = "1.3.2"
 [[deps.Bzip2_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "19a35467a82e236ff51bc17a3a44b69ef35185a2"
 uuid = "6e34b625-4abd-537c-b88f-471c36dfa7a0"
 version = "1.0.8+0"
 [[deps.Calculus]]
 deps = ["LinearAlgebra"]
 git-tree-sha1 = "f641eb0a4f00c343bbc32346e1217b86f3ce9dad"
 uuid = "49dc2e85-a5d0-5ad3-a950-438e2897f1b9"
 version = "0.5.1"
 [[deps.CodecBzip2]]
 deps = ["Bzip2_jll", "Libdl", "TranscodingStreams"]
 git-tree-sha1 = "2e62a725210ce3c3c2e1a3080190e7ca491f18d7"
 uuid = "523fee87-0ab8-5b00-afb7-3ecf72e48cfd"
 version = "0.7.2"
 [[deps.CodecZlib]]
 deps = ["TranscodingStreams", "Zlib_jll"]
 git-tree-sha1 = "9c209fb7536406834aa938fb149964b985de6c83"
 uuid = "944b1d66-785c-5afd-91f1-9de20f533193"
 version = "0.7.1"
 [[deps.CommonSubexpressions]]
 deps = ["MacroTools", "Test"]
 git-tree-sha1 = "7b8a93dba8af7e3b42fecabf646260105ac373f7"
 uuid = "bbf7d656-a473-5ed7-a52c-81e309532950"
 version = "0.3.0"
 [[deps.Compat]]
 deps = ["UUIDs"]
 git-tree-sha1 = "7a60c856b9fa189eb34f5f8a6f6b5529b7942957"
 uuid = "34da2185-b29b-5c13-b0c7-acf172513d20"
 version = "4.6.1"
 weakdeps = ["Dates", "LinearAlgebra"]
    [deps.Compat.extensions]
    CompatLinearAlgebraExt = "LinearAlgebra"
 [[deps.CompilerSupportLibraries_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
 version = "1.0.2+0"
 [[deps.Conda]]
 deps = ["Downloads", "JSON", "VersionParsing"]
 git-tree-sha1 = "e32a90da027ca45d84678b826fffd3110bb3fc90"
 uuid = "8f4d0f93-b110-5947-807f-2305c1781a2d"
 version = "1.8.0"
 [[deps.DataAPI]]
 git-tree-sha1 = "8da84edb865b0b5b0100c0666a9bc9a0b71c553c"
 uuid = "9a962f9c-6df0-11e9-0e5d-c546b8b5ee8a"
 version = "1.15.0"
 [[deps.DataStructures]]
 deps = ["Compat", "InteractiveUtils", "OrderedCollections"]
 git-tree-sha1 = "d1fff3a548102f48987a52a2e0d114fa97d730f0"
 uuid = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
 version = "0.18.13"
 [[deps.Dates]]
 deps = ["Printf"]
 uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
 [[deps.DiffResults]]
 deps = ["StaticArraysCore"]
 git-tree-sha1 = "782dd5f4561f5d267313f23853baaaa4c52ea621"
 uuid = "163ba53b-c6d8-5494-b064-1a9d43ac40c5"
 version = "1.1.0"
 [[deps.DiffRules]]
 deps = ["IrrationalConstants", "LogExpFunctions", "NaNMath", "Random", "SpecialFunctions"]
 git-tree-sha1 = "23163d55f885173722d1e4cf0f6110cdbaf7e272"
 uuid = "b552c78f-8df3-52c6-915a-8e097449b14b"
 version = "1.15.1"
 [[deps.Distributions]]
 deps = ["FillArrays", "LinearAlgebra", "PDMats", "Printf", "QuadGK", "Random", "SparseArrays", "SpecialFunctions", "Statistics", "StatsAPI", "StatsBase", "StatsFuns", "Test"]
 git-tree-sha1 = "c72970914c8a21b36bbc244e9df0ed1834a0360b"
 uuid = "31c24e10-a181-5473-b8eb-7969acd0382f"
 version = "0.25.95"
    [deps.Distributions.extensions]
    DistributionsChainRulesCoreExt = "ChainRulesCore"
    DistributionsDensityInterfaceExt = "DensityInterface"
    [deps.Distributions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    DensityInterface = "b429d917-457f-4dbc-8f4c-0cc954292b1d"
 [[deps.DocStringExtensions]]
 deps = ["LibGit2"]
 git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d"
 uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
 version = "0.9.3"
 [[deps.Downloads]]
 deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"]
 uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
 version = "1.6.0"
 [[deps.DualNumbers]]
 deps = ["Calculus", "NaNMath", "SpecialFunctions"]
 git-tree-sha1 = "5837a837389fccf076445fce071c8ddaea35a566"
 uuid = "fa6b7ba4-c1ee-5f82-b5fc-ecf0adba8f74"
 version = "0.6.8"
 [[deps.ExprTools]]
 git-tree-sha1 = "c1d06d129da9f55715c6c212866f5b1bddc5fa00"
 uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
 version = "0.1.9"
 [[deps.FileIO]]
 deps = ["Pkg", "Requires", "UUIDs"]
 git-tree-sha1 = "299dc33549f68299137e51e6d49a13b5b1da9673"
 uuid = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
 version = "1.16.1"
 [[deps.FileWatching]]
 uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee"
 [[deps.FillArrays]]
 deps = ["LinearAlgebra", "Random", "SparseArrays", "Statistics"]
 git-tree-sha1 = "589d3d3bff204bdd80ecc53293896b4f39175723"
 uuid = "1a297f60-69ca-5386-bcde-b61e274b549b"
 version = "1.1.1"
 [[deps.ForwardDiff]]
 deps = ["CommonSubexpressions", "DiffResults", "DiffRules", "LinearAlgebra", "LogExpFunctions", "NaNMath", "Preferences", "Printf", "Random", "SpecialFunctions"]
 git-tree-sha1 = "00e252f4d706b3d55a8863432e742bf5717b498d"
 uuid = "f6369f11-7733-5829-9624-2563aa707210"
 version = "0.10.35"
    [deps.ForwardDiff.extensions]
    ForwardDiffStaticArraysExt = "StaticArrays"
    [deps.ForwardDiff.weakdeps]
    StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
 [[deps.Gurobi]]
 deps = ["LazyArtifacts", "Libdl", "MathOptInterface"]
 git-tree-sha1 = "22439b1c2bacb7d50ed0df7dbd10211e0b4cd379"
 uuid = "2e9cd046-0924-5485-92f1-d5272153d98b"
 version = "1.0.1"
 [[deps.HDF5]]
 deps = ["Compat", "HDF5_jll", "Libdl", "Mmap", "Random", "Requires", "UUIDs"]
 git-tree-sha1 = "c73fdc3d9da7700691848b78c61841274076932a"
 uuid = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
 version = "0.16.15"
 [[deps.HDF5_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "LLVMOpenMP_jll", "LazyArtifacts", "LibCURL_jll", "Libdl", "MPICH_jll", "MPIPreferences", "MPItrampoline_jll", "MicrosoftMPI_jll", "OpenMPI_jll", "OpenSSL_jll", "TOML", "Zlib_jll", "libaec_jll"]
 git-tree-sha1 = "3b20c3ce9c14aedd0adca2bc8c882927844bd53d"
 uuid = "0234f1f7-429e-5d53-9886-15a909be8d59"
 version = "1.14.0+0"
 [[deps.HiGHS]]
 deps = ["HiGHS_jll", "MathOptInterface", "PrecompileTools", "SparseArrays"]
 git-tree-sha1 = "bbd4ab443dfac4c9d5c5b40dd45f598dfad2e26a"
 uuid = "87dc4568-4c63-4d18-b0c0-bb2238e4078b"
 version = "1.5.2"
 [[deps.HiGHS_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl"]
 git-tree-sha1 = "216e7198aeb256e7c7921ef2937d7e1e589ba6fd"
 uuid = "8fd58aa0-07eb-5a78-9b36-339c94fd15ea"
 version = "1.5.3+0"
 [[deps.HypergeometricFunctions]]
 deps = ["DualNumbers", "LinearAlgebra", "OpenLibm_jll", "SpecialFunctions"]
 git-tree-sha1 = "84204eae2dd237500835990bcade263e27674a93"
 uuid = "34004b35-14d8-5ef3-9330-4cdb6864b03a"
 version = "0.3.16"
 [[deps.InteractiveUtils]]
 deps = ["Markdown"]
 uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
 [[deps.IrrationalConstants]]
 git-tree-sha1 = "630b497eafcc20001bba38a4651b327dcfc491d2"
 uuid = "92d709cd-6900-40b7-9082-c6be49f344b6"
 version = "0.2.2"
 [[deps.JLD2]]
 deps = ["FileIO", "MacroTools", "Mmap", "OrderedCollections", "Pkg", "Printf", "Reexport", "Requires", "TranscodingStreams", "UUIDs"]
 git-tree-sha1 = "42c17b18ced77ff0be65957a591d34f4ed57c631"
 uuid = "033835bb-8acc-5ee8-8aae-3f567f8a3819"
 version = "0.4.31"
 [[deps.JLLWrappers]]
 deps = ["Preferences"]
 git-tree-sha1 = "abc9885a7ca2052a736a600f7fa66209f96506e1"
 uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210"
 version = "1.4.1"
 [[deps.JSON]]
 deps = ["Dates", "Mmap", "Parsers", "Unicode"]
 git-tree-sha1 = "31e996f0a15c7b280ba9f76636b3ff9e2ae58c9a"
 uuid = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
 version = "0.21.4"
 [[deps.JuMP]]
 deps = ["LinearAlgebra", "MathOptInterface", "MutableArithmetics", "OrderedCollections", "Printf", "SnoopPrecompile", "SparseArrays"]
 git-tree-sha1 = "3e4a73edf2ca1bfe97f1fc86eb4364f95ef0fccd"
 uuid = "4076af6c-e467-56ae-b986-b466b2749572"
 version = "1.11.1"
 [[deps.KLU]]
 deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse_jll"]
 git-tree-sha1 = "764164ed65c30738750965d55652db9c94c59bfe"
 uuid = "ef3ab10e-7fda-4108-b977-705223b18434"
 version = "0.4.0"
 [[deps.LLVMOpenMP_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "f689897ccbe049adb19a065c495e75f372ecd42b"
 uuid = "1d63c593-3942-5779-bab2-d838dc0a180e"
 version = "15.0.4+0"
 [[deps.LazyArtifacts]]
 deps = ["Artifacts", "Pkg"]
 uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
 [[deps.LibCURL]]
 deps = ["LibCURL_jll", "MozillaCACerts_jll"]
 uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
 version = "0.6.3"
 [[deps.LibCURL_jll]]
 deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
 uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
 version = "7.84.0+0"
 [[deps.LibGit2]]
 deps = ["Base64", "NetworkOptions", "Printf", "SHA"]
 uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
 [[deps.LibSSH2_jll]]
 deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
 uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
 version = "1.10.2+0"
 [[deps.Libdl]]
 uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
 [[deps.LinearAlgebra]]
 deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"]
 uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 [[deps.LogExpFunctions]]
 deps = ["DocStringExtensions", "IrrationalConstants", "LinearAlgebra"]
 git-tree-sha1 = "c3ce8e7420b3a6e071e0fe4745f5d4300e37b13f"
 uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
 version = "0.3.24"
    [deps.LogExpFunctions.extensions]
    LogExpFunctionsChainRulesCoreExt = "ChainRulesCore"
    LogExpFunctionsChangesOfVariablesExt = "ChangesOfVariables"
    LogExpFunctionsInverseFunctionsExt = "InverseFunctions"
    [deps.LogExpFunctions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    ChangesOfVariables = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0"
    InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112"
 [[deps.Logging]]
 uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
 [[deps.MIPLearn]]
 deps = ["Conda", "DataStructures", "HDF5", "HiGHS", "JLD2", "JuMP", "KLU", "LinearAlgebra", "MathOptInterface", "OrderedCollections", "Printf", "PyCall", "Random", "Requires", "SparseArrays", "Statistics", "TimerOutputs"]
 path = "/home/axavier/Packages/MIPLearn.jl/dev/"
 uuid = "2b1277c3-b477-4c49-a15e-7ba350325c68"
 version = "0.3.0"
 [[deps.MPICH_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "MPIPreferences", "TOML"]
 git-tree-sha1 = "d790fbd913f85e8865c55bf4725aff197c5155c8"
 uuid = "7cb0a576-ebde-5e09-9194-50597f1243b4"
 version = "4.1.1+1"
 [[deps.MPIPreferences]]
 deps = ["Libdl", "Preferences"]
 git-tree-sha1 = "d86a788b336e8ae96429c0c42740ccd60ac0dfcc"
 uuid = "3da0fdf6-3ccc-4f1b-acd9-58baa6c99267"
 version = "0.1.8"
 [[deps.MPItrampoline_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "MPIPreferences", "TOML"]
 git-tree-sha1 = "b3dcf8e1c610a10458df3c62038c8cc3a4d6291d"
 uuid = "f1f71cc9-e9ae-5b93-9b94-4fe0e1ad3748"
 version = "5.3.0+0"
 [[deps.MacroTools]]
 deps = ["Markdown", "Random"]
 git-tree-sha1 = "42324d08725e200c23d4dfb549e0d5d89dede2d2"
 uuid = "1914dd2f-81c6-5fcd-8719-6d5c9610ff09"
 version = "0.5.10"
 [[deps.Markdown]]
 deps = ["Base64"]
 uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
 [[deps.MathOptInterface]]
 deps = ["BenchmarkTools", "CodecBzip2", "CodecZlib", "DataStructures", "ForwardDiff", "JSON", "LinearAlgebra", "MutableArithmetics", "NaNMath", "OrderedCollections", "PrecompileTools", "Printf", "SparseArrays", "SpecialFunctions", "Test", "Unicode"]
 git-tree-sha1 = "19a3636968e802918f8891d729c74bd64dff6d00"
 uuid = "b8f27783-ece8-5eb3-8dc8-9495eed66fee"
 version = "1.17.1"
 [[deps.MbedTLS_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
 version = "2.28.2+0"
 [[deps.MicrosoftMPI_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "a8027af3d1743b3bfae34e54872359fdebb31422"
 uuid = "9237b28f-5490-5468-be7b-bb81f5f5e6cf"
 version = "10.1.3+4"
 [[deps.Missings]]
 deps = ["DataAPI"]
 git-tree-sha1 = "f66bdc5de519e8f8ae43bdc598782d35a25b1272"
 uuid = "e1d29d7a-bbdc-5cf2-9ac0-f12de2c33e28"
 version = "1.1.0"
 [[deps.Mmap]]
 uuid = "a63ad114-7e13-5084-954f-fe012c677804"
 [[deps.MozillaCACerts_jll]]
 uuid = "14a3606d-f60d-562e-9121-12d972cd8159"
 version = "2022.10.11"
 [[deps.MutableArithmetics]]
 deps = ["LinearAlgebra", "SparseArrays", "Test"]
 git-tree-sha1 = "964cb1a7069723727025ae295408747a0b36a854"
 uuid = "d8a4904e-b15c-11e9-3269-09a3773c0cb0"
 version = "1.3.0"
 [[deps.NaNMath]]
 deps = ["OpenLibm_jll"]
 git-tree-sha1 = "0877504529a3e5c3343c6f8b4c0381e57e4387e4"
 uuid = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3"
 version = "1.0.2"
 [[deps.NetworkOptions]]
 uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"
 version = "1.2.0"
 [[deps.OpenBLAS_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"]
 uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
 version = "0.3.21+4"
 [[deps.OpenLibm_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "05823500-19ac-5b8b-9628-191a04bc5112"
 version = "0.8.1+0"
 [[deps.OpenMPI_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "MPIPreferences", "TOML"]
 git-tree-sha1 = "f3080f4212a8ba2ceb10a34b938601b862094314"
 uuid = "fe0851c0-eecd-5654-98d4-656369965a5c"
 version = "4.1.5+0"
 [[deps.OpenSSL_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl"]
 git-tree-sha1 = "cae3153c7f6cf3f069a853883fd1919a6e5bab5b"
 uuid = "458c3c95-2e84-50aa-8efc-19380b2a3a95"
 version = "3.0.9+0"
 [[deps.OpenSpecFun_jll]]
 deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1"
 uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e"
 version = "0.5.5+0"
 [[deps.OrderedCollections]]
 git-tree-sha1 = "d321bf2de576bf25ec4d3e4360faca399afca282"
 uuid = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
 version = "1.6.0"
 [[deps.PDMats]]
 deps = ["LinearAlgebra", "SparseArrays", "SuiteSparse"]
 git-tree-sha1 = "67eae2738d63117a196f497d7db789821bce61d1"
 uuid = "90014a1f-27ba-587c-ab20-58faa44d9150"
 version = "0.11.17"
 [[deps.Parsers]]
 deps = ["Dates", "PrecompileTools", "UUIDs"]
 git-tree-sha1 = "a5aef8d4a6e8d81f171b2bd4be5265b01384c74c"
 uuid = "69de0a69-1ddd-5017-9359-2bf0b02dc9f0"
 version = "2.5.10"
 [[deps.Pkg]]
 deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"]
 uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 version = "1.9.0"
 [[deps.PrecompileTools]]
 deps = ["Preferences"]
 git-tree-sha1 = "9673d39decc5feece56ef3940e5dafba15ba0f81"
 uuid = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
 version = "1.1.2"
 [[deps.Preferences]]
 deps = ["TOML"]
 git-tree-sha1 = "7eb1686b4f04b82f96ed7a4ea5890a4f0c7a09f1"
 uuid = "21216c6a-2e73-6563-6e65-726566657250"
 version = "1.4.0"
 [[deps.Printf]]
 deps = ["Unicode"]
 uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
 [[deps.Profile]]
 deps = ["Printf"]
 uuid = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"
 [[deps.PyCall]]
 deps = ["Conda", "Dates", "Libdl", "LinearAlgebra", "MacroTools", "Serialization", "VersionParsing"]
 git-tree-sha1 = "62f417f6ad727987c755549e9cd88c46578da562"
 uuid = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
 version = "1.95.1"
 [[deps.QuadGK]]
 deps = ["DataStructures", "LinearAlgebra"]
 git-tree-sha1 = "6ec7ac8412e83d57e313393220879ede1740f9ee"
 uuid = "1fd47b50-473d-5c70-9696-f719f8f3bcdc"
 version = "2.8.2"
 [[deps.REPL]]
 deps = ["InteractiveUtils", "Markdown", "Sockets", "Unicode"]
 uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
 [[deps.Random]]
 deps = ["SHA", "Serialization"]
 uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 [[deps.Reexport]]
 git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b"
 uuid = "189a3867-3050-52da-a836-e630ba90ab69"
 version = "1.2.2"
 [[deps.Requires]]
 deps = ["UUIDs"]
 git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7"
 uuid = "ae029012-a4dd-5104-9daa-d747884805df"
 version = "1.3.0"
 [[deps.Rmath]]
 deps = ["Random", "Rmath_jll"]
 git-tree-sha1 = "f65dcb5fa46aee0cf9ed6274ccbd597adc49aa7b"
 uuid = "79098fc4-a85e-5d69-aa6a-4863f24498fa"
 version = "0.7.1"
 [[deps.Rmath_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl", "Pkg"]
 git-tree-sha1 = "6ed52fdd3382cf21947b15e8870ac0ddbff736da"
 uuid = "f50d1b31-88e8-58de-be2c-1cc44531875f"
 version = "0.4.0+0"
 [[deps.SHA]]
 uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
 version = "0.7.0"
 [[deps.Serialization]]
 uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
 [[deps.SnoopPrecompile]]
 deps = ["Preferences"]
 git-tree-sha1 = "e760a70afdcd461cf01a575947738d359234665c"
 uuid = "66db9d55-30c0-4569-8b51-7e840670fc0c"
 version = "1.0.3"
 [[deps.Sockets]]
 uuid = "6462fe0b-24de-5631-8697-dd941f90decc"
 [[deps.SortingAlgorithms]]
 deps = ["DataStructures"]
 git-tree-sha1 = "a4ada03f999bd01b3a25dcaa30b2d929fe537e00"
 uuid = "a2af1166-a08f-5f64-846c-94a0d3cef48c"
 version = "1.1.0"
 [[deps.SparseArrays]]
 deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"]
 uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
 [[deps.SpecialFunctions]]
 deps = ["IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"]
 git-tree-sha1 = "ef28127915f4229c971eb43f3fc075dd3fe91880"
 uuid = "276daf66-3868-5448-9aa4-cd146d93841b"
 version = "2.2.0"
    [deps.SpecialFunctions.extensions]
    SpecialFunctionsChainRulesCoreExt = "ChainRulesCore"
    [deps.SpecialFunctions.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
 [[deps.StaticArraysCore]]
 git-tree-sha1 = "6b7ba252635a5eff6a0b0664a41ee140a1c9e72a"
 uuid = "1e83bf80-4336-4d27-bf5d-d5a4f845583c"
 version = "1.4.0"
 [[deps.Statistics]]
 deps = ["LinearAlgebra", "SparseArrays"]
 uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 version = "1.9.0"
 [[deps.StatsAPI]]
 deps = ["LinearAlgebra"]
 git-tree-sha1 = "45a7769a04a3cf80da1c1c7c60caf932e6f4c9f7"
 uuid = "82ae8749-77ed-4fe6-ae5f-f523153014b0"
 version = "1.6.0"
 [[deps.StatsBase]]
 deps = ["DataAPI", "DataStructures", "LinearAlgebra", "LogExpFunctions", "Missings", "Printf", "Random", "SortingAlgorithms", "SparseArrays", "Statistics", "StatsAPI"]
 git-tree-sha1 = "75ebe04c5bed70b91614d684259b661c9e6274a4"
 uuid = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
 version = "0.34.0"
 [[deps.StatsFuns]]
 deps = ["HypergeometricFunctions", "IrrationalConstants", "LogExpFunctions", "Reexport", "Rmath", "SpecialFunctions"]
 git-tree-sha1 = "f625d686d5a88bcd2b15cd81f18f98186fdc0c9a"
 uuid = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
 version = "1.3.0"
    [deps.StatsFuns.extensions]
    StatsFunsChainRulesCoreExt = "ChainRulesCore"
    StatsFunsInverseFunctionsExt = "InverseFunctions"
    [deps.StatsFuns.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112"
 [[deps.SuiteSparse]]
 deps = ["Libdl", "LinearAlgebra", "Serialization", "SparseArrays"]
 uuid = "4607b0f0-06f3-5cda-b6b1-a6196a1729e9"
 [[deps.SuiteSparse_jll]]
 deps = ["Artifacts", "Libdl", "Pkg", "libblastrampoline_jll"]
 uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c"
 version = "5.10.1+6"
 [[deps.Suppressor]]
 git-tree-sha1 = "c6ed566db2fe3931292865b966d6d140b7ef32a9"
 uuid = "fd094767-a336-5f1f-9728-57cf17d0bbfb"
 version = "0.2.1"
 [[deps.TOML]]
 deps = ["Dates"]
 uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
 version = "1.0.3"
 [[deps.Tar]]
 deps = ["ArgTools", "SHA"]
 uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"
 version = "1.10.0"
 [[deps.Test]]
 deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
 uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 [[deps.TimerOutputs]]
 deps = ["ExprTools", "Printf"]
 git-tree-sha1 = "f548a9e9c490030e545f72074a41edfd0e5bcdd7"
 uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
 version = "0.5.23"
 [[deps.TranscodingStreams]]
 deps = ["Random", "Test"]
 git-tree-sha1 = "9a6ae7ed916312b41236fcef7e0af564ef934769"
 uuid = "3bb67fe8-82b1-5028-8e26-92a6c54297fa"
 version = "0.9.13"
 [[deps.UUIDs]]
 deps = ["Random", "SHA"]
 uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
 [[deps.Unicode]]
 uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
 [[deps.VersionParsing]]
 git-tree-sha1 = "58d6e80b4ee071f5efd07fda82cb9fbe17200868"
 uuid = "81def892-9a0e-5fdd-b105-ffc91e053289"
 version = "1.3.0"
 [[deps.Zlib_jll]]
 deps = ["Libdl"]
 uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
 version = "1.2.13+0"
 [[deps.libaec_jll]]
 deps = ["Artifacts", "JLLWrappers", "Libdl"]
 git-tree-sha1 = "eddd19a8dea6b139ea97bdc8a0e2667d4b661720"
 uuid = "477f73a3-ac25-53e9-8cc3-50b2fa2566f0"
 version = "1.0.6+1"
 [[deps.libblastrampoline_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "8e850b90-86db-534c-a0d3-1478176c7d93"
 version = "5.7.0+0"
 [[deps.nghttp2_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
 version = "1.48.0+0"
 [[deps.p7zip_jll]]
 deps = ["Artifacts", "Libdl"]
 uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0"
 version = "17.4.0+0"
--- a/docs/tutorials/Project.toml
+++ b/docs/tutorials/Project.toml
@@ -0,0 +1,7 @@
 [deps]
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 Gurobi = "2e9cd046-0924-5485-92f1-d5272153d98b"
 JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
 MIPLearn = "2b1277c3-b477-4c49-a15e-7ba350325c68"
 PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
 Suppressor = "fd094767-a336-5f1f-9728-57cf17d0bbfb"
--- a/docs/tutorials/getting-started-gurobipy.ipynb
+++ b/docs/tutorials/getting-started-gurobipy.ipynb
@@ -0,0 +1,849 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6b8983b1",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Getting started (Gurobipy)\n",
    "\n",
    "## Introduction\n",
    "\n",
    "**MIPLearn** is an open source framework that uses machine learning (ML) to accelerate the performance of mixed-integer programming solvers (e.g. Gurobi, CPLEX, XPRESS). In this tutorial, we will:\n",
    "\n",
    "1. Install the Python/Gurobipy version of MIPLearn\n",
    "2. Model a simple optimization problem using Gurobipy\n",
    "3. Generate training data and train the ML models\n",
    "4. Use the ML models together Gurobi to solve new instances\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "The Python/Gurobipy version of MIPLearn is only compatible with the Gurobi Optimizer. For broader solver compatibility, see the Python/Pyomo and Julia/JuMP versions of the package.\n",
    "</div>\n",
    "\n",
    "<div class=\"alert alert-warning\">\n",
    "Warning\n",
    "    \n",
    "MIPLearn is still in early development stage. If run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!\n",
    "    \n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02f0a927",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "MIPLearn is available in two versions:\n",
    "\n",
    "- Python version, compatible with the Pyomo and Gurobipy modeling languages,\n",
    "- Julia version, compatible with the JuMP modeling language.\n",
    "\n",
    "In this tutorial, we will demonstrate how to use and install the Python/Gurobipy version of the package. The first step is to install Python 3.8+ in your computer. See the [official Python website for more instructions](https://www.python.org/downloads/). After Python is installed, we proceed to install MIPLearn using `pip`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "cd8a69c1",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:18:02.381829278Z",
     "start_time": "2023-06-06T20:18:02.381532300Z"
    }
   },
   "outputs": [],
   "source": [
    "# !pip install MIPLearn==0.3.0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8274543",
   "metadata": {},
   "source": [
    "In addition to MIPLearn itself, we will also install Gurobi 10.0, a state-of-the-art commercial MILP solver. This step also install a demo license for Gurobi, which should able to solve the small optimization problems in this tutorial. A license is required for solving larger-scale problems."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dcc8756c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:18:15.537811992Z",
     "start_time": "2023-06-06T20:18:13.449177860Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: gurobipy<10.1,>=10 in /home/axavier/Software/anaconda3/envs/miplearn/lib/python3.8/site-packages (10.0.1)\n"
     ]
    }
   ],
   "source": [
    "!pip install 'gurobipy>=10,<10.1'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a14e4550",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "    \n",
    "In the code above, we install specific version of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is usually a recommended practice for all Python projects.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16b86823",
   "metadata": {},
   "source": [
    "## Modeling a simple optimization problem\n",
    "\n",
    "To illustrate how can MIPLearn be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem,** a practical optimization problem solved daily by electric grid operators around the world. \n",
    "\n",
    "Suppose that a utility company needs to decide which electrical generators should be online at each hour of the day, as well as how much power should each generator produce. More specifically, assume that the company owns $n$ generators, denoted by $g_1, \\ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\\text{min}_i$ to $p^\\text{max}_i$ megawatts of power, and it costs the company $c^\\text{fix}_i + c^\\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. The total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts).\n",
    "\n",
    "This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \\in \\{0,1\\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \\geq 0$ be a decision variable indicating how much power does $g_i$ produce. The problem is then given by:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f12c3702",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{align}\n",
    "\\text{minimize } \\quad & \\sum_{i=1}^n \\left( c^\\text{fix}_i x_i + c^\\text{var}_i y_i \\right) \\\\\n",
    "\\text{subject to } \\quad & y_i \\leq p^\\text{max}_i x_i & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq p^\\text{min}_i x_i & i=1,\\ldots,n \\\\\n",
    "& \\sum_{i=1}^n y_i = d \\\\\n",
    "& x_i \\in \\{0,1\\} & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq 0 & i=1,\\ldots,n\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be3989ed",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Note\n",
    "\n",
    "We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem.\n",
    "\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5fd33f6",
   "metadata": {},
   "source": [
    "Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Python and Pyomo. We start by defining a data class `UnitCommitmentData`, which holds all the input data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "22a67170-10b4-43d3-8708-014d91141e73",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:18:25.442346786Z",
     "start_time": "2023-06-06T20:18:25.329017476Z"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from dataclasses import dataclass\n",
    "from typing import List\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "@dataclass\n",
    "class UnitCommitmentData:\n",
    "    demand: float\n",
    "    pmin: List[float]\n",
    "    pmax: List[float]\n",
    "    cfix: List[float]\n",
    "    cvar: List[float]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29f55efa-0751-465a-9b0a-a821d46a3d40",
   "metadata": {},
   "source": [
    "Next, we write a `build_uc_model` function, which converts the input data into a concrete Pyomo model. The function accepts `UnitCommitmentData`, the data structure we previously defined, or the path to a compressed pickle file containing this data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "2f67032f-0d74-4317-b45c-19da0ec859e9",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:48:05.953902842Z",
     "start_time": "2023-06-06T20:48:05.909747925Z"
    }
   },
   "outputs": [],
   "source": [
    "import gurobipy as gp\n",
    "from gurobipy import GRB, quicksum\n",
    "from typing import Union\n",
    "from miplearn.io import read_pkl_gz\n",
    "from miplearn.solvers.gurobi import GurobiModel\n",
    "\n",
    "def build_uc_model(data: Union[str, UnitCommitmentData]) -> GurobiModel:\n",
    "    if isinstance(data, str):\n",
    "        data = read_pkl_gz(data)\n",
    "\n",
    "    model = gp.Model()\n",
    "    n = len(data.pmin)\n",
    "    x = model._x = model.addVars(n, vtype=GRB.BINARY, name=\"x\")\n",
    "    y = model._y = model.addVars(n, name=\"y\")\n",
    "    model.setObjective(\n",
    "        quicksum(\n",
    "            data.cfix[i] * x[i] + data.cvar[i] * y[i] for i in range(n)\n",
    "        )\n",
    "    )\n",
    "    model.addConstrs(y[i] <= data.pmax[i] * x[i] for i in range(n))\n",
    "    model.addConstrs(y[i] >= data.pmin[i] * x[i] for i in range(n))\n",
    "    model.addConstr(quicksum(y[i] for i in range(n)) == data.demand)\n",
    "    return GurobiModel(model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c22714a3",
   "metadata": {},
   "source": [
    "At this point, we can already use Pyomo and any mixed-integer linear programming solver to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2a896f47",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:14.266758244Z",
     "start_time": "2023-06-06T20:49:14.223514806Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Restricted license - for non-production use only - expires 2024-10-28\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 7 rows, 6 columns and 15 nonzeros\n",
      "Model fingerprint: 0x58dfdd53\n",
      "Variable types: 3 continuous, 3 integer (3 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 7e+01]\n",
      "  Objective range  [2e+00, 7e+02]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [1e+02, 1e+02]\n",
      "Presolve removed 2 rows and 1 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 5 rows, 5 columns, 13 nonzeros\n",
      "Variable types: 0 continuous, 5 integer (3 binary)\n",
      "Found heuristic solution: objective 1400.0000000\n",
      "\n",
      "Root relaxation: objective 1.035000e+03, 3 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 1035.00000    0    1 1400.00000 1035.00000  26.1%     -    0s\n",
      "     0     0 1105.71429    0    1 1400.00000 1105.71429  21.0%     -    0s\n",
      "*    0     0               0    1320.0000000 1320.00000  0.00%     -    0s\n",
      "\n",
      "Explored 1 nodes (5 simplex iterations) in 0.01 seconds (0.00 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 2: 1320 1400 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 1.320000000000e+03, best bound 1.320000000000e+03, gap 0.0000%\n",
      "obj = 1320.0\n",
      "x = [-0.0, 1.0, 1.0]\n",
      "y = [0.0, 60.0, 40.0]\n"
     ]
    }
   ],
   "source": [
    "model = build_uc_model(\n",
    "    UnitCommitmentData(\n",
    "        demand=100.0,\n",
    "        pmin=[10, 20, 30],\n",
    "        pmax=[50, 60, 70],\n",
    "        cfix=[700, 600, 500],\n",
    "        cvar=[1.5, 2.0, 2.5],\n",
    "    )\n",
    ")\n",
    "\n",
    "model.optimize()\n",
    "print(\"obj =\", model.inner.objVal)\n",
    "print(\"x =\", [model.inner._x[i].x for i in range(3)])\n",
    "print(\"y =\", [model.inner._y[i].x for i in range(3)])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41b03bbc",
   "metadata": {},
   "source": [
    "Running the code above, we found that the optimal solution for our small problem instance costs \\$1320. It is achieve by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01f576e1-1790-425e-9e5c-9fa07b6f4c26",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "\n",
    "- In the example above, `GurobiModel` is just a thin wrapper around a standard Gurobi model. This wrapper allows MIPLearn to be solver- and modeling-language-agnostic. The wrapper provides only a few basic methods, such as `optimize`. For more control, and to query the solution, the original Gurobi model can be accessed through `model.inner`, as illustrated above.\n",
    "- To ensure training data consistency, MIPLearn requires all decision variables to have names.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf60c1dd",
   "metadata": {},
   "source": [
    "## Generating training data\n",
    "\n",
    "Although Gurobi could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as it is often the case in practice, it could make sense to spend some time upfront generating a **trained** solver, which can optimize new instances (similar to the ones it was trained on) faster.\n",
    "\n",
    "In the following, we will use MIPLearn to train machine learning models that is able to predict the optimal solution for instances that follow a given probability distribution, then it will provide this predicted solution to Gurobi as a warm start. Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "5eb09fab",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:22.758192368Z",
     "start_time": "2023-06-06T20:49:22.724784572Z"
    }
   },
   "outputs": [],
   "source": [
    "from scipy.stats import uniform\n",
    "from typing import List\n",
    "import random\n",
    "\n",
    "\n",
    "def random_uc_data(samples: int, n: int, seed: int = 42) -> List[UnitCommitmentData]:\n",
    "    random.seed(seed)\n",
    "    np.random.seed(seed)\n",
    "    pmin = uniform(loc=100_000.0, scale=400_000.0).rvs(n)\n",
    "    pmax = pmin * uniform(loc=2.0, scale=2.5).rvs(n)\n",
    "    cfix = pmin * uniform(loc=100.0, scale=25.0).rvs(n)\n",
    "    cvar = uniform(loc=1.25, scale=0.25).rvs(n)\n",
    "    return [\n",
    "        UnitCommitmentData(\n",
    "            demand=pmax.sum() * uniform(loc=0.5, scale=0.25).rvs(),\n",
    "            pmin=pmin,\n",
    "            pmax=pmax,\n",
    "            cfix=cfix,\n",
    "            cvar=cvar,\n",
    "        )\n",
    "        for _ in range(samples)\n",
    "    ]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a03a7ac",
   "metadata": {},
   "source": [
    "In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.\n",
    "\n",
    "Now we generate 500 instances of this problem, each one with 50 generators, and we use 450 of these instances for training. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold in memory the entire training data, as well as the concrete Pyomo models. Files also make it much easier to solve multiple instances simultaneously, potentially on multiple machines. The code below generates the files `uc/train/00000.pkl.gz`, `uc/train/00001.pkl.gz`, etc., which contain the input data in compressed (gzipped) pickle format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6156752c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:24.811192929Z",
     "start_time": "2023-06-06T20:49:24.575639142Z"
    }
   },
   "outputs": [],
   "source": [
    "from miplearn.io import write_pkl_gz\n",
    "\n",
    "data = random_uc_data(samples=500, n=500)\n",
    "train_data = write_pkl_gz(data[0:450], \"uc/train\")\n",
    "test_data = write_pkl_gz(data[450:500], \"uc/test\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b17af877",
   "metadata": {},
   "source": [
    "Finally, we use `BasicCollector` to collect the optimal solutions and other useful training data for all training instances. The data is stored in HDF5 files `uc/train/00000.h5`, `uc/train/00001.h5`, etc. The optimization models are also exported to compressed MPS files `uc/train/00000.mps.gz`, `uc/train/00001.mps.gz`, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "7623f002",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:34.936729253Z",
     "start_time": "2023-06-06T20:49:25.936126612Z"
    }
   },
   "outputs": [],
   "source": [
    "from miplearn.collectors.basic import BasicCollector\n",
    "\n",
    "bc = BasicCollector()\n",
    "bc.collect(train_data, build_uc_model, n_jobs=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c42b1be1-9723-4827-82d8-974afa51ef9f",
   "metadata": {},
   "source": [
    "## Training and solving test instances"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a33c6aa4-f0b8-4ccb-9935-01f7d7de2a1c",
   "metadata": {},
   "source": [
    "With training data in hand, we can now design and train a machine learning model to accelerate solver performance. In this tutorial, for illustration purposes, we will use ML to generate a good warm start using $k$-nearest neighbors. More specifically, the strategy is to:\n",
    "\n",
    "1. Memorize the optimal solutions of all training instances;\n",
    "2. Given a test instance, find the 25 most similar training instances, based on constraint right-hand sides;\n",
    "3. Merge their optimal solutions into a single partial solution; specifically, only assign values to the binary variables that agree unanimously.\n",
    "4. Provide this partial solution to the solver as a warm start.\n",
    "\n",
    "This simple strategy can be implemented as shown below, using `MemorizingPrimalComponent`. For more advanced strategies, and for the usage of more advanced classifiers, see the user guide."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "435f7bf8-4b09-4889-b1ec-b7b56e7d8ed2",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:38.997939600Z",
     "start_time": "2023-06-06T20:49:38.968261432Z"
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from miplearn.components.primal.actions import SetWarmStart\n",
    "from miplearn.components.primal.mem import (\n",
    "    MemorizingPrimalComponent,\n",
    "    MergeTopSolutions,\n",
    ")\n",
    "from miplearn.extractors.fields import H5FieldsExtractor\n",
    "\n",
    "comp = MemorizingPrimalComponent(\n",
    "    clf=KNeighborsClassifier(n_neighbors=25),\n",
    "    extractor=H5FieldsExtractor(\n",
    "        instance_fields=[\"static_constr_rhs\"],\n",
    "    ),\n",
    "    constructor=MergeTopSolutions(25, [0.0, 1.0]),\n",
    "    action=SetWarmStart(),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9536e7e4-0b0d-49b0-bebd-4a848f839e94",
   "metadata": {},
   "source": [
    "Having defined the ML strategy, we next construct `LearningSolver`, train the ML component and optimize one of the test instances."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "9d13dd50-3dcf-4673-a757-6f44dcc0dedf",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:42.072345411Z",
     "start_time": "2023-06-06T20:49:41.294040974Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0xa8b70287\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve removed 1000 rows and 500 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 500 columns, 500 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.6166537e+09   5.648803e+04   0.000000e+00      0s\n",
      "       1    8.2906219e+09   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  8.290621916e+09\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x4ccd7ae3\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "\n",
      "User MIP start produced solution with objective 8.30129e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.29184e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.29146e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.29146e+09 (0.01s)\n",
      "Loaded user MIP start with objective 8.29146e+09\n",
      "\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "\n",
      "Root relaxation: objective 8.290622e+09, 512 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 8.2906e+09    0    1 8.2915e+09 8.2906e+09  0.01%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Cover: 1\n",
      "  Flow cover: 2\n",
      "\n",
      "Explored 1 nodes (512 simplex iterations) in 0.07 seconds (0.01 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 3: 8.29146e+09 8.29184e+09 8.30129e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.291459497797e+09, best bound 8.290645029670e+09, gap 0.0098%\n"
     ]
    }
   ],
   "source": [
    "from miplearn.solvers.learning import LearningSolver\n",
    "\n",
    "solver_ml = LearningSolver(components=[comp])\n",
    "solver_ml.fit(train_data)\n",
    "solver_ml.optimize(test_data[0], build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61da6dad-7f56-4edb-aa26-c00eb5f946c0",
   "metadata": {},
   "source": [
    "By examining the solve log above, specifically the line `Loaded user MIP start with objective...`, we can see that MIPLearn was able to construct an initial solution which turned out to be very close to the optimal solution to the problem. Now let us repeat the code above, but a solver which does not apply any ML strategies. Note that our previously-defined component is not provided."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "2ff391ed-e855-4228-aa09-a7641d8c2893",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:49:44.012782276Z",
     "start_time": "2023-06-06T20:49:43.813974362Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0xa8b70287\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve removed 1000 rows and 500 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 500 columns, 500 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.6166537e+09   5.648803e+04   0.000000e+00      0s\n",
      "       1    8.2906219e+09   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  8.290621916e+09\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x4cbbf7c7\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Found heuristic solution: objective 9.757128e+09\n",
      "\n",
      "Root relaxation: objective 8.290622e+09, 512 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 8.2906e+09    0    1 9.7571e+09 8.2906e+09  15.0%     -    0s\n",
      "H    0     0                    8.298273e+09 8.2906e+09  0.09%     -    0s\n",
      "     0     0 8.2907e+09    0    4 8.2983e+09 8.2907e+09  0.09%     -    0s\n",
      "     0     0 8.2907e+09    0    1 8.2983e+09 8.2907e+09  0.09%     -    0s\n",
      "     0     0 8.2907e+09    0    4 8.2983e+09 8.2907e+09  0.09%     -    0s\n",
      "H    0     0                    8.293980e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2907e+09    0    5 8.2940e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2907e+09    0    1 8.2940e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2907e+09    0    2 8.2940e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2908e+09    0    1 8.2940e+09 8.2908e+09  0.04%     -    0s\n",
      "     0     0 8.2908e+09    0    4 8.2940e+09 8.2908e+09  0.04%     -    0s\n",
      "     0     0 8.2908e+09    0    4 8.2940e+09 8.2908e+09  0.04%     -    0s\n",
      "H    0     0                    8.291465e+09 8.2908e+09  0.01%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Gomory: 2\n",
      "  MIR: 1\n",
      "\n",
      "Explored 1 nodes (1031 simplex iterations) in 0.07 seconds (0.03 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 4: 8.29147e+09 8.29398e+09 8.29827e+09 9.75713e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.291465302389e+09, best bound 8.290781665333e+09, gap 0.0082%\n"
     ]
    }
   ],
   "source": [
    "solver_baseline = LearningSolver(components=[])\n",
    "solver_baseline.fit(train_data)\n",
    "solver_baseline.optimize(test_data[0], build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6d37b88-9fcc-43ee-ac1e-2a7b1e51a266",
   "metadata": {},
   "source": [
    "In the log above, the `MIP start` line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution at the end, but it required using its own internal heuristic procedures. In this example, because we solve very small optimization problems, there was almost no difference in terms of running time, but the difference can be significant for larger problems."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eec97f06",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Accessing the solution\n",
    "\n",
    "In the example above, we used `LearningSolver.solve` together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders, and could be retrieved by reading theses files, but that is not very convenient. In the following example, we show how to build and solve a Pyomo model entirely in-memory, using our trained solver."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "67a6cd18",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:50:12.869892930Z",
     "start_time": "2023-06-06T20:50:12.509410473Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x19042f12\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve removed 1000 rows and 500 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 500 columns, 500 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.5917580e+09   5.627453e+04   0.000000e+00      0s\n",
      "       1    8.2535968e+09   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  8.253596777e+09\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x8ee64638\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "\n",
      "User MIP start produced solution with objective 8.25814e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.25512e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.25459e+09 (0.04s)\n",
      "User MIP start produced solution with objective 8.25459e+09 (0.04s)\n",
      "Loaded user MIP start with objective 8.25459e+09\n",
      "\n",
      "Presolve time: 0.01s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "\n",
      "Root relaxation: objective 8.253597e+09, 512 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 8.2536e+09    0    1 8.2546e+09 8.2536e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    3 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    1 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    4 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    4 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2538e+09    0    4 8.2546e+09 8.2538e+09  0.01%     -    0s\n",
      "     0     0 8.2538e+09    0    5 8.2546e+09 8.2538e+09  0.01%     -    0s\n",
      "     0     0 8.2538e+09    0    6 8.2546e+09 8.2538e+09  0.01%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Cover: 1\n",
      "  MIR: 2\n",
      "  StrongCG: 1\n",
      "  Flow cover: 1\n",
      "\n",
      "Explored 1 nodes (575 simplex iterations) in 0.12 seconds (0.01 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 3: 8.25459e+09 8.25512e+09 8.25814e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.254590409970e+09, best bound 8.253768093811e+09, gap 0.0100%\n",
      "obj = 8254590409.969726\n",
      "x = [1.0, 1.0, 0.0]\n",
      "y = [935662.0949263407, 1604270.0218116897, 0.0]\n"
     ]
    }
   ],
   "source": [
    "data = random_uc_data(samples=1, n=500)[0]\n",
    "model = build_uc_model(data)\n",
    "solver_ml.optimize(model)\n",
    "print(\"obj =\", model.inner.objVal)\n",
    "print(\"x =\", [model.inner._x[i].x for i in range(3)])\n",
    "print(\"y =\", [model.inner._y[i].x for i in range(3)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5593d23a-83bd-4e16-8253-6300f5e3f63b",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/tutorials/getting-started-jump.ipynb
+++ b/docs/tutorials/getting-started-jump.ipynb
@@ -0,0 +1,680 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6b8983b1",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Getting started (JuMP)\n",
    "\n",
    "## Introduction\n",
    "\n",
    "**MIPLearn** is an open source framework that uses machine learning (ML) to accelerate the performance of mixed-integer programming solvers (e.g. Gurobi, CPLEX, XPRESS). In this tutorial, we will:\n",
    "\n",
    "1. Install the Julia/JuMP version of MIPLearn\n",
    "2. Model a simple optimization problem using JuMP\n",
    "3. Generate training data and train the ML models\n",
    "4. Use the ML models together Gurobi to solve new instances\n",
    "\n",
    "<div class=\"alert alert-warning\">\n",
    "Warning\n",
    "    \n",
    "MIPLearn is still in early development stage. If run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!\n",
    "    \n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02f0a927",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "MIPLearn is available in two versions:\n",
    "\n",
    "- Python version, compatible with the Pyomo and Gurobipy modeling languages,\n",
    "- Julia version, compatible with the JuMP modeling language.\n",
    "\n",
    "In this tutorial, we will demonstrate how to use and install the Python/Pyomo version of the package. The first step is to install Julia in your machine. See the [official Julia website for more instructions](https://julialang.org/downloads/). After Julia is installed, launch the Julia REPL, type `]` to enter package mode, then install MIPLearn:\n",
    "\n",
    "```\n",
    "pkg> add MIPLearn@0.3\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8274543",
   "metadata": {},
   "source": [
    "In addition to MIPLearn itself, we will also install:\n",
    "\n",
    "- the JuMP modeling language\n",
    "- Gurobi, a state-of-the-art commercial MILP solver\n",
    "- Distributions, to generate random data\n",
    "- PyCall, to access ML model from Scikit-Learn\n",
    "- Suppressor, to make the output cleaner\n",
    "\n",
    "```\n",
    "pkg> add JuMP@1, Gurobi@1, Distributions@0.25, PyCall@1, Suppressor@0.2\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a14e4550",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "\n",
    "- If you do not have a Gurobi license available, you can also follow the tutorial by installing an open-source solver, such as `HiGHS`, and replacing `Gurobi.Optimizer` by `HiGHS.Optimizer` in all the code examples.\n",
    "- In the code above, we install specific version of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is usually a recommended practice for all Julia projects.\n",
    "   \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16b86823",
   "metadata": {},
   "source": [
    "## Modeling a simple optimization problem\n",
    "\n",
    "To illustrate how can MIPLearn be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem,** a practical optimization problem solved daily by electric grid operators around the world. \n",
    "\n",
    "Suppose that a utility company needs to decide which electrical generators should be online at each hour of the day, as well as how much power should each generator produce. More specifically, assume that the company owns $n$ generators, denoted by $g_1, \\ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\\text{min}_i$ to $p^\\text{max}_i$ megawatts of power, and it costs the company $c^\\text{fix}_i + c^\\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. The total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts).\n",
    "\n",
    "This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \\in \\{0,1\\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \\geq 0$ be a decision variable indicating how much power does $g_i$ produce. The problem is then given by:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f12c3702",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{align}\n",
    "\\text{minimize } \\quad & \\sum_{i=1}^n \\left( c^\\text{fix}_i x_i + c^\\text{var}_i y_i \\right) \\\\\n",
    "\\text{subject to } \\quad & y_i \\leq p^\\text{max}_i x_i & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq p^\\text{min}_i x_i & i=1,\\ldots,n \\\\\n",
    "& \\sum_{i=1}^n y_i = d \\\\\n",
    "& x_i \\in \\{0,1\\} & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq 0 & i=1,\\ldots,n\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be3989ed",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Note\n",
    "\n",
    "We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem.\n",
    "\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5fd33f6",
   "metadata": {},
   "source": [
    "Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Julia and JuMP. We start by defining a data class `UnitCommitmentData`, which holds all the input data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c62ebff1-db40-45a1-9997-d121837f067b",
   "metadata": {},
   "outputs": [],
   "source": [
    "struct UnitCommitmentData\n",
    "    demand::Float64\n",
    "    pmin::Vector{Float64}\n",
    "    pmax::Vector{Float64}\n",
    "    cfix::Vector{Float64}\n",
    "    cvar::Vector{Float64}\n",
    "end;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29f55efa-0751-465a-9b0a-a821d46a3d40",
   "metadata": {},
   "source": [
    "Next, we write a `build_uc_model` function, which converts the input data into a concrete JuMP model. The function accepts `UnitCommitmentData`, the data structure we previously defined, or the path to a JLD2 file containing this data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "79ef7775-18ca-4dfa-b438-49860f762ad0",
   "metadata": {},
   "outputs": [],
   "source": [
    "using MIPLearn\n",
    "using JuMP\n",
    "using Gurobi\n",
    "\n",
    "function build_uc_model(data)\n",
    "    if data isa String\n",
    "        data = read_jld2(data)\n",
    "    end\n",
    "    model = Model(Gurobi.Optimizer)\n",
    "    G = 1:length(data.pmin)\n",
    "    @variable(model, x[G], Bin)\n",
    "    @variable(model, y[G] >= 0)\n",
    "    @objective(model, Min, sum(data.cfix[g] * x[g] + data.cvar[g] * y[g] for g in G))\n",
    "    @constraint(model, eq_max_power[g in G], y[g] <= data.pmax[g] * x[g])\n",
    "    @constraint(model, eq_min_power[g in G], y[g] >= data.pmin[g] * x[g])\n",
    "    @constraint(model, eq_demand, sum(y[g] for g in G) == data.demand)\n",
    "    return JumpModel(model)\n",
    "end;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c22714a3",
   "metadata": {},
   "source": [
    "At this point, we can already use Gurobi to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "dd828d68-fd43-4d2a-a058-3e2628d99d9e",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:01:10.993801745Z",
     "start_time": "2023-06-06T20:01:10.887580927Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 32 threads\n",
      "\n",
      "Optimize a model with 7 rows, 6 columns and 15 nonzeros\n",
      "Model fingerprint: 0x55e33a07\n",
      "Variable types: 3 continuous, 3 integer (3 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 7e+01]\n",
      "  Objective range  [2e+00, 7e+02]\n",
      "  Bounds range     [0e+00, 0e+00]\n",
      "  RHS range        [1e+02, 1e+02]\n",
      "Presolve removed 2 rows and 1 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 5 rows, 5 columns, 13 nonzeros\n",
      "Variable types: 0 continuous, 5 integer (3 binary)\n",
      "Found heuristic solution: objective 1400.0000000\n",
      "\n",
      "Root relaxation: objective 1.035000e+03, 3 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 1035.00000    0    1 1400.00000 1035.00000  26.1%     -    0s\n",
      "     0     0 1105.71429    0    1 1400.00000 1105.71429  21.0%     -    0s\n",
      "*    0     0               0    1320.0000000 1320.00000  0.00%     -    0s\n",
      "\n",
      "Explored 1 nodes (5 simplex iterations) in 0.00 seconds (0.00 work units)\n",
      "Thread count was 32 (of 32 available processors)\n",
      "\n",
      "Solution count 2: 1320 1400 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 1.320000000000e+03, best bound 1.320000000000e+03, gap 0.0000%\n",
      "\n",
      "User-callback calls 371, time in user-callback 0.00 sec\n",
      "objective_value(model.inner) = 1320.0\n",
      "Vector(value.(model.inner[:x])) = [-0.0, 1.0, 1.0]\n",
      "Vector(value.(model.inner[:y])) = [0.0, 60.0, 40.0]\n"
     ]
    }
   ],
   "source": [
    "model = build_uc_model(\n",
    "    UnitCommitmentData(\n",
    "        100.0,  # demand\n",
    "        [10, 20, 30],  # pmin\n",
    "        [50, 60, 70],  # pmax\n",
    "        [700, 600, 500],  # cfix\n",
    "        [1.5, 2.0, 2.5],  # cvar\n",
    "    )\n",
    ")\n",
    "model.optimize()\n",
    "@show objective_value(model.inner)\n",
    "@show Vector(value.(model.inner[:x]))\n",
    "@show Vector(value.(model.inner[:y]));"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41b03bbc",
   "metadata": {},
   "source": [
    "Running the code above, we found that the optimal solution for our small problem instance costs \\$1320. It is achieve by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01f576e1-1790-425e-9e5c-9fa07b6f4c26",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Notes\n",
    "    \n",
    "- In the example above, `JumpModel` is just a thin wrapper around a standard JuMP model. This wrapper allows MIPLearn to be solver- and modeling-language-agnostic. The wrapper provides only a few basic methods, such as `optimize`. For more control, and to query the solution, the original JuMP model can be accessed through `model.inner`, as illustrated above.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf60c1dd",
   "metadata": {},
   "source": [
    "## Generating training data\n",
    "\n",
    "Although Gurobi could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as it is often the case in practice, it could make sense to spend some time upfront generating a **trained** solver, which can optimize new instances (similar to the ones it was trained on) faster.\n",
    "\n",
    "In the following, we will use MIPLearn to train machine learning models that is able to predict the optimal solution for instances that follow a given probability distribution, then it will provide this predicted solution to Gurobi as a warm start. Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "1326efd7-3869-4137-ab6b-df9cb609a7e0",
   "metadata": {},
   "outputs": [],
   "source": [
    "using Distributions\n",
    "using Random\n",
    "\n",
    "function random_uc_data(; samples::Int, n::Int, seed::Int=42)::Vector\n",
    "    Random.seed!(seed)\n",
    "    pmin = rand(Uniform(100_000, 500_000), n)\n",
    "    pmax = pmin .* rand(Uniform(2, 2.5), n)\n",
    "    cfix = pmin .* rand(Uniform(100, 125), n)\n",
    "    cvar = rand(Uniform(1.25, 1.50), n)\n",
    "    return [\n",
    "        UnitCommitmentData(\n",
    "            sum(pmax) * rand(Uniform(0.5, 0.75)),\n",
    "            pmin,\n",
    "            pmax,\n",
    "            cfix,\n",
    "            cvar,\n",
    "        )\n",
    "        for _ in 1:samples\n",
    "    ]\n",
    "end;"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a03a7ac",
   "metadata": {},
   "source": [
    "In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.\n",
    "\n",
    "Now we generate 500 instances of this problem, each one with 500 generators, and we use 450 of these instances for training. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold the entire training data, as well as the concrete JuMP models, in memory. Files also make it much easier to solve multiple instances simultaneously, potentially on multiple machines. The code below generates the files `uc/train/00001.jld2`, `uc/train/00002.jld2`, etc., which contain the input data in JLD2 format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "6156752c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:03:04.782830561Z",
     "start_time": "2023-06-06T20:03:04.530421396Z"
    }
   },
   "outputs": [],
   "source": [
    "data = random_uc_data(samples=500, n=500)\n",
    "train_data = write_jld2(data[1:450], \"uc/train\")\n",
    "test_data = write_jld2(data[451:500], \"uc/test\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b17af877",
   "metadata": {},
   "source": [
    "Finally, we use `BasicCollector` to collect the optimal solutions and other useful training data for all training instances. The data is stored in HDF5 files `uc/train/00001.h5`, `uc/train/00002.h5`, etc. The optimization models are also exported to compressed MPS files `uc/train/00001.mps.gz`, `uc/train/00002.mps.gz`, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "7623f002",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:03:35.571497019Z",
     "start_time": "2023-06-06T20:03:25.804104036Z"
    }
   },
   "outputs": [],
   "source": [
    "using Suppressor\n",
    "@suppress_out begin\n",
    "    bc = BasicCollector()\n",
    "    bc.collect(train_data, build_uc_model)\n",
    "end"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c42b1be1-9723-4827-82d8-974afa51ef9f",
   "metadata": {},
   "source": [
    "## Training and solving test instances"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a33c6aa4-f0b8-4ccb-9935-01f7d7de2a1c",
   "metadata": {},
   "source": [
    "With training data in hand, we can now design and train a machine learning model to accelerate solver performance. In this tutorial, for illustration purposes, we will use ML to generate a good warm start using $k$-nearest neighbors. More specifically, the strategy is to:\n",
    "\n",
    "1. Memorize the optimal solutions of all training instances;\n",
    "2. Given a test instance, find the 25 most similar training instances, based on constraint right-hand sides;\n",
    "3. Merge their optimal solutions into a single partial solution; specifically, only assign values to the binary variables on which all of these solutions agree;\n",
    "4. Provide this partial solution to the solver as a warm start.\n",
    "\n",
    "This simple strategy can be implemented as shown below, using `MemorizingPrimalComponent`. For more advanced strategies, and for the usage of more advanced classifiers, see the user guide."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "435f7bf8-4b09-4889-b1ec-b7b56e7d8ed2",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:05:20.497772794Z",
     "start_time": "2023-06-06T20:05:20.484821405Z"
    }
   },
   "outputs": [],
   "source": [
    "# Load kNN classifier from Scikit-Learn\n",
    "using PyCall\n",
    "KNeighborsClassifier = pyimport(\"sklearn.neighbors\").KNeighborsClassifier\n",
    "\n",
    "# Build the MIPLearn component\n",
    "comp = MemorizingPrimalComponent(\n",
    "    clf=KNeighborsClassifier(n_neighbors=25),\n",
    "    extractor=H5FieldsExtractor(\n",
    "        instance_fields=[\"static_constr_rhs\"],\n",
    "    ),\n",
    "    constructor=MergeTopSolutions(25, [0.0, 1.0]),\n",
    "    action=SetWarmStart(),\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9536e7e4-0b0d-49b0-bebd-4a848f839e94",
   "metadata": {},
   "source": [
    "Having defined the ML strategy, we next construct `LearningSolver`, train the ML component and optimize one of the test instances."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "9d13dd50-3dcf-4673-a757-6f44dcc0dedf",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:05:22.672002339Z",
     "start_time": "2023-06-06T20:05:21.447466634Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 32 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0xd2378195\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [0e+00, 0e+00]\n",
      "  RHS range        [2e+08, 2e+08]\n",
      "\n",
      "User MIP start produced solution with objective 1.02165e+10 (0.00s)\n",
      "Loaded user MIP start with objective 1.02165e+10\n",
      "\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "\n",
      "Root relaxation: objective 1.021568e+10, 510 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 1.0216e+10    0    1 1.0217e+10 1.0216e+10  0.01%     -    0s\n",
      "\n",
      "Explored 1 nodes (510 simplex iterations) in 0.01 seconds (0.00 work units)\n",
      "Thread count was 32 (of 32 available processors)\n",
      "\n",
      "Solution count 1: 1.02165e+10 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 1.021651058978e+10, best bound 1.021567971257e+10, gap 0.0081%\n",
      "\n",
      "User-callback calls 169, time in user-callback 0.00 sec\n"
     ]
    }
   ],
   "source": [
    "solver_ml = LearningSolver(components=[comp])\n",
    "solver_ml.fit(train_data)\n",
    "solver_ml.optimize(test_data[1], build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61da6dad-7f56-4edb-aa26-c00eb5f946c0",
   "metadata": {},
   "source": [
    "By examining the solve log above, specifically the line `Loaded user MIP start with objective...`, we can see that MIPLearn was able to construct an initial solution which turned out to be very close to the optimal solution to the problem. Now let us repeat the code above, but with a solver that does not apply any ML strategies. Note that our previously-defined component is not provided."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "2ff391ed-e855-4228-aa09-a7641d8c2893",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:05:46.969575966Z",
     "start_time": "2023-06-06T20:05:46.420803286Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 32 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0xb45c0594\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [0e+00, 0e+00]\n",
      "  RHS range        [2e+08, 2e+08]\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Found heuristic solution: objective 1.071463e+10\n",
      "\n",
      "Root relaxation: objective 1.021568e+10, 510 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 1.0216e+10    0    1 1.0715e+10 1.0216e+10  4.66%     -    0s\n",
      "H    0     0                    1.025162e+10 1.0216e+10  0.35%     -    0s\n",
      "     0     0 1.0216e+10    0    1 1.0252e+10 1.0216e+10  0.35%     -    0s\n",
      "H    0     0                    1.023090e+10 1.0216e+10  0.15%     -    0s\n",
      "H    0     0                    1.022335e+10 1.0216e+10  0.07%     -    0s\n",
      "H    0     0                    1.022281e+10 1.0216e+10  0.07%     -    0s\n",
      "H    0     0                    1.021753e+10 1.0216e+10  0.02%     -    0s\n",
      "H    0     0                    1.021752e+10 1.0216e+10  0.02%     -    0s\n",
      "     0     0 1.0216e+10    0    3 1.0218e+10 1.0216e+10  0.02%     -    0s\n",
      "     0     0 1.0216e+10    0    1 1.0218e+10 1.0216e+10  0.02%     -    0s\n",
      "H    0     0                    1.021651e+10 1.0216e+10  0.01%     -    0s\n",
      "\n",
      "Explored 1 nodes (764 simplex iterations) in 0.03 seconds (0.02 work units)\n",
      "Thread count was 32 (of 32 available processors)\n",
      "\n",
      "Solution count 7: 1.02165e+10 1.02175e+10 1.02228e+10 ... 1.07146e+10\n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 1.021651058978e+10, best bound 1.021573363741e+10, gap 0.0076%\n",
      "\n",
      "User-callback calls 204, time in user-callback 0.00 sec\n"
     ]
    }
   ],
   "source": [
    "solver_baseline = LearningSolver(components=[])\n",
    "solver_baseline.fit(train_data)\n",
    "solver_baseline.optimize(test_data[1], build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6d37b88-9fcc-43ee-ac1e-2a7b1e51a266",
   "metadata": {},
   "source": [
    "In the log above, the `MIP start` line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution at the end, but it required using its own internal heuristic procedures. In this example, because we solve very small optimization problems, there was almost no difference in terms of running time, but the difference can be significant for larger problems."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eec97f06",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Accessing the solution\n",
    "\n",
    "In the example above, we used `LearningSolver.optimize` together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders, and could be retrieved by reading these files, but that is not very convenient. In the following example, we show how to build and solve a JuMP model entirely in-memory, using our trained solver."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "67a6cd18",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:06:26.913448568Z",
     "start_time": "2023-06-06T20:06:26.169047914Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]\n",
      "Thread count: 16 physical cores, 32 logical processors, using up to 32 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x974a7fba\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 1e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [0e+00, 0e+00]\n",
      "  RHS range        [2e+08, 2e+08]\n",
      "\n",
      "User MIP start produced solution with objective 9.86729e+09 (0.00s)\n",
      "User MIP start produced solution with objective 9.86675e+09 (0.00s)\n",
      "User MIP start produced solution with objective 9.86654e+09 (0.01s)\n",
      "User MIP start produced solution with objective 9.8661e+09 (0.01s)\n",
      "Loaded user MIP start with objective 9.8661e+09\n",
      "\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "\n",
      "Root relaxation: objective 9.865344e+09, 510 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 9.8653e+09    0    1 9.8661e+09 9.8653e+09  0.01%     -    0s\n",
      "\n",
      "Explored 1 nodes (510 simplex iterations) in 0.02 seconds (0.01 work units)\n",
      "Thread count was 32 (of 32 available processors)\n",
      "\n",
      "Solution count 4: 9.8661e+09 9.86654e+09 9.86675e+09 9.86729e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 9.866096485614e+09, best bound 9.865343669936e+09, gap 0.0076%\n",
      "\n",
      "User-callback calls 182, time in user-callback 0.00 sec\n",
      "objective_value(model.inner) = 9.866096485613789e9\n"
     ]
    }
   ],
   "source": [
    "data = random_uc_data(samples=1, n=500)[1]\n",
    "model = build_uc_model(data)\n",
    "solver_ml.optimize(model)\n",
    "@show objective_value(model.inner);"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.9.0",
   "language": "julia",
   "name": "julia-1.9"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/docs/tutorials/getting-started-pyomo.ipynb
+++ b/docs/tutorials/getting-started-pyomo.ipynb
@@ -0,0 +1,869 @@
 {
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6b8983b1",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Getting started (Pyomo)\n",
    "\n",
    "## Introduction\n",
    "\n",
    "**MIPLearn** is an open source framework that uses machine learning (ML) to accelerate the performance of mixed-integer programming solvers (e.g. Gurobi, CPLEX, XPRESS). In this tutorial, we will:\n",
    "\n",
    "1. Install the Python/Pyomo version of MIPLearn\n",
    "2. Model a simple optimization problem using Pyomo\n",
    "3. Generate training data and train the ML models\n",
    "4. Use the ML models together with Gurobi to solve new instances\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note\n",
    "    \n",
    "The Python/Pyomo version of MIPLearn is currently only compatible with Pyomo persistent solvers (Gurobi, CPLEX and XPRESS). For broader solver compatibility, see the Julia/JuMP version of the package.\n",
    "</div>\n",
    "\n",
    "<div class=\"alert alert-warning\">\n",
    "Warning\n",
    "    \n",
    "MIPLearn is still in an early development stage. If you run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!\n",
    "    \n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "02f0a927",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "MIPLearn is available in two versions:\n",
    "\n",
    "- Python version, compatible with the Pyomo and Gurobipy modeling languages,\n",
    "- Julia version, compatible with the JuMP modeling language.\n",
    "\n",
    "In this tutorial, we will demonstrate how to install and use the Python/Pyomo version of the package. The first step is to install Python 3.8+ on your computer. See the [official Python website for more instructions](https://www.python.org/downloads/). After Python is installed, we proceed to install MIPLearn using `pip`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "cd8a69c1",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T19:57:33.202580815Z",
     "start_time": "2023-06-06T19:57:33.198341886Z"
    }
   },
   "outputs": [],
   "source": [
    "# !pip install MIPLearn==0.3.0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e8274543",
   "metadata": {},
   "source": [
    "In addition to MIPLearn itself, we will also install Gurobi 10.0, a state-of-the-art commercial MILP solver. This step also installs a demo license for Gurobi, which should be able to solve the small optimization problems in this tutorial. A license is required for solving larger-scale problems."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dcc8756c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T19:57:35.756831801Z",
     "start_time": "2023-06-06T19:57:33.201767088Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: gurobipy<10.1,>=10 in /home/axavier/Software/anaconda3/envs/miplearn/lib/python3.8/site-packages (10.0.1)\n"
     ]
    }
   ],
   "source": [
    "!pip install 'gurobipy>=10,<10.1'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a14e4550",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Note\n",
    "    \n",
    "In the code above, we install specific versions of all packages to ensure that this tutorial keeps running in the future, even when newer (and possibly incompatible) versions of the packages are released. This is usually a recommended practice for all Python projects.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16b86823",
   "metadata": {},
   "source": [
    "## Modeling a simple optimization problem\n",
    "\n",
    "To illustrate how MIPLearn can be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the **unit commitment problem**, a practical optimization problem solved daily by electric grid operators around the world.\n",
    "\n",
    "Suppose that a utility company needs to decide which electrical generators should be online at each hour of the day, as well as how much power each generator should produce. More specifically, assume that the company owns $n$ generators, denoted by $g_1, \\ldots, g_n$. Each generator can either be online or offline. An online generator $g_i$ can produce between $p^\\text{min}_i$ and $p^\\text{max}_i$ megawatts of power, and it costs the company $c^\\text{fix}_i + c^\\text{var}_i y_i$, where $y_i$ is the amount of power produced. An offline generator produces nothing and costs nothing. The total amount of power to be produced needs to be exactly equal to the total demand $d$ (in megawatts).\n",
    "\n",
    "This simple problem can be modeled as a *mixed-integer linear optimization* problem as follows. For each generator $g_i$, let $x_i \\in \\{0,1\\}$ be a decision variable indicating whether $g_i$ is online, and let $y_i \\geq 0$ be a decision variable indicating how much power $g_i$ produces. The problem is then given by:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f12c3702",
   "metadata": {},
   "source": [
    "$$\n",
    "\\begin{align}\n",
    "\\text{minimize } \\quad & \\sum_{i=1}^n \\left( c^\\text{fix}_i x_i + c^\\text{var}_i y_i \\right) \\\\\n",
    "\\text{subject to } \\quad & y_i \\leq p^\\text{max}_i x_i & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq p^\\text{min}_i x_i & i=1,\\ldots,n \\\\\n",
    "& \\sum_{i=1}^n y_i = d \\\\\n",
    "& x_i \\in \\{0,1\\} & i=1,\\ldots,n \\\\\n",
    "& y_i \\geq 0 & i=1,\\ldots,n\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be3989ed",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "\n",
    "Note\n",
    "\n",
    "We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem.\n",
    "\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5fd33f6",
   "metadata": {},
   "source": [
    "Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Python and Pyomo. We start by defining a data class `UnitCommitmentData`, which holds all the input data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "22a67170-10b4-43d3-8708-014d91141e73",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:00:03.278853343Z",
     "start_time": "2023-06-06T20:00:03.123324067Z"
    },
    "tags": []
   },
   "outputs": [],
   "source": [
    "from dataclasses import dataclass\n",
    "from typing import List\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "@dataclass\n",
    "class UnitCommitmentData:\n",
    "    demand: float\n",
    "    pmin: List[float]\n",
    "    pmax: List[float]\n",
    "    cfix: List[float]\n",
    "    cvar: List[float]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29f55efa-0751-465a-9b0a-a821d46a3d40",
   "metadata": {},
   "source": [
    "Next, we write a `build_uc_model` function, which converts the input data into a concrete Pyomo model. The function accepts `UnitCommitmentData`, the data structure we previously defined, or the path to a compressed pickle file containing this data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "2f67032f-0d74-4317-b45c-19da0ec859e9",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:00:45.890126754Z",
     "start_time": "2023-06-06T20:00:45.637044282Z"
    }
   },
   "outputs": [],
   "source": [
    "import pyomo.environ as pe\n",
    "from typing import Union\n",
    "from miplearn.io import read_pkl_gz\n",
    "from miplearn.solvers.pyomo import PyomoModel\n",
    "\n",
    "\n",
    "def build_uc_model(data: Union[str, UnitCommitmentData]) -> PyomoModel:\n",
    "    if isinstance(data, str):\n",
    "        data = read_pkl_gz(data)\n",
    "\n",
    "    model = pe.ConcreteModel()\n",
    "    n = len(data.pmin)\n",
    "    model.x = pe.Var(range(n), domain=pe.Binary)\n",
    "    model.y = pe.Var(range(n), domain=pe.NonNegativeReals)\n",
    "    model.obj = pe.Objective(\n",
    "        expr=sum(\n",
    "            data.cfix[i] * model.x[i] + data.cvar[i] * model.y[i] for i in range(n)\n",
    "        )\n",
    "    )\n",
    "    model.eq_max_power = pe.ConstraintList()\n",
    "    model.eq_min_power = pe.ConstraintList()\n",
    "    for i in range(n):\n",
    "        model.eq_max_power.add(model.y[i] <= data.pmax[i] * model.x[i])\n",
    "        model.eq_min_power.add(model.y[i] >= data.pmin[i] * model.x[i])\n",
    "    model.eq_demand = pe.Constraint(\n",
    "        expr=sum(model.y[i] for i in range(n)) == data.demand,\n",
    "    )\n",
    "    return PyomoModel(model, \"gurobi_persistent\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c22714a3",
   "metadata": {},
   "source": [
    "At this point, we can already use Pyomo and any mixed-integer linear programming solver to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2a896f47",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:01:10.993801745Z",
     "start_time": "2023-06-06T20:01:10.887580927Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Restricted license - for non-production use only - expires 2024-10-28\n",
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 7 rows, 6 columns and 15 nonzeros\n",
      "Model fingerprint: 0x15c7a953\n",
      "Variable types: 3 continuous, 3 integer (3 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 7e+01]\n",
      "  Objective range  [2e+00, 7e+02]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [1e+02, 1e+02]\n",
      "Presolve removed 2 rows and 1 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 5 rows, 5 columns, 13 nonzeros\n",
      "Variable types: 0 continuous, 5 integer (3 binary)\n",
      "Found heuristic solution: objective 1400.0000000\n",
      "\n",
      "Root relaxation: objective 1.035000e+03, 3 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 1035.00000    0    1 1400.00000 1035.00000  26.1%     -    0s\n",
      "     0     0 1105.71429    0    1 1400.00000 1105.71429  21.0%     -    0s\n",
      "*    0     0               0    1320.0000000 1320.00000  0.00%     -    0s\n",
      "\n",
      "Explored 1 nodes (5 simplex iterations) in 0.01 seconds (0.00 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 2: 1320 1400 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 1.320000000000e+03, best bound 1.320000000000e+03, gap 0.0000%\n",
      "WARNING: Cannot get reduced costs for MIP.\n",
      "WARNING: Cannot get duals for MIP.\n",
      "obj = 1320.0\n",
      "x = [-0.0, 1.0, 1.0]\n",
      "y = [0.0, 60.0, 40.0]\n"
     ]
    }
   ],
   "source": [
    "model = build_uc_model(\n",
    "    UnitCommitmentData(\n",
    "        demand=100.0,\n",
    "        pmin=[10, 20, 30],\n",
    "        pmax=[50, 60, 70],\n",
    "        cfix=[700, 600, 500],\n",
    "        cvar=[1.5, 2.0, 2.5],\n",
    "    )\n",
    ")\n",
    "\n",
    "model.optimize()\n",
    "print(\"obj =\", model.inner.obj())\n",
    "print(\"x =\", [model.inner.x[i].value for i in range(3)])\n",
    "print(\"y =\", [model.inner.y[i].value for i in range(3)])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41b03bbc",
   "metadata": {},
   "source": [
    "Running the code above, we found that the optimal solution for our small problem instance costs \\$1320. It is achieved by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "01f576e1-1790-425e-9e5c-9fa07b6f4c26",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-info\">\n",
    "    \n",
    "Notes\n",
    "    \n",
    "- In the example above, `PyomoModel` is just a thin wrapper around a standard Pyomo model. This wrapper allows MIPLearn to be solver- and modeling-language-agnostic. The wrapper provides only a few basic methods, such as `optimize`. For more control, and to query the solution, the original Pyomo model can be accessed through `model.inner`, as illustrated above.    \n",
    "- To use CPLEX or XPRESS instead of Gurobi, replace `gurobi_persistent` with `cplex_persistent` or `xpress_persistent` in the `build_uc_model` function. Note that only persistent Pyomo solvers are currently supported. Pull requests adding support for other types of solvers are very welcome.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cf60c1dd",
   "metadata": {},
   "source": [
    "## Generating training data\n",
    "\n",
    "Although Gurobi could solve the small example above in a fraction of a second, it gets slower for larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as is often the case in practice, it could make sense to spend some time upfront generating a **trained** solver, which can optimize new instances (similar to the ones it was trained on) faster.\n",
    "\n",
    "In the following, we will use MIPLearn to train a machine learning model that is able to predict the optimal solution for instances that follow a given probability distribution, and then provide this predicted solution to Gurobi as a warm start. Before we can train the model, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "5eb09fab",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:02:27.324208900Z",
     "start_time": "2023-06-06T20:02:26.990044230Z"
    }
   },
   "outputs": [],
   "source": [
    "from scipy.stats import uniform\n",
    "from typing import List\n",
    "import random\n",
    "\n",
    "\n",
    "def random_uc_data(samples: int, n: int, seed: int = 42) -> List[UnitCommitmentData]:\n",
    "    random.seed(seed)\n",
    "    np.random.seed(seed)\n",
    "    pmin = uniform(loc=100_000.0, scale=400_000.0).rvs(n)\n",
    "    pmax = pmin * uniform(loc=2.0, scale=2.5).rvs(n)\n",
    "    cfix = pmin * uniform(loc=100.0, scale=25.0).rvs(n)\n",
    "    cvar = uniform(loc=1.25, scale=0.25).rvs(n)\n",
    "    return [\n",
    "        UnitCommitmentData(\n",
    "            demand=pmax.sum() * uniform(loc=0.5, scale=0.25).rvs(),\n",
    "            pmin=pmin,\n",
    "            pmax=pmax,\n",
    "            cfix=cfix,\n",
    "            cvar=cvar,\n",
    "        )\n",
    "        for _ in range(samples)\n",
    "    ]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a03a7ac",
   "metadata": {},
   "source": [
    "In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.\n",
    "\n",
    "Now we generate 500 instances of this problem, each one with 500 generators, and we use 450 of these instances for training. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold the entire training data, as well as the concrete Pyomo models, in memory. Files also make it much easier to solve multiple instances simultaneously, potentially on multiple machines. The code below generates the files `uc/train/00000.pkl.gz`, `uc/train/00001.pkl.gz`, etc., which contain the input data in compressed (gzipped) pickle format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "6156752c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:03:04.782830561Z",
     "start_time": "2023-06-06T20:03:04.530421396Z"
    }
   },
   "outputs": [],
   "source": [
    "from miplearn.io import write_pkl_gz\n",
    "\n",
    "data = random_uc_data(samples=500, n=500)\n",
    "train_data = write_pkl_gz(data[0:450], \"uc/train\")\n",
    "test_data = write_pkl_gz(data[450:500], \"uc/test\")"
   ]
  },
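  {
   "cell_type": "markdown",
   "id": "sketch-pkl-gz-roundtrip",
   "metadata": {},
   "source": [
    "As a rough illustration of the gzipped pickle format mentioned above, the sketch below writes and reads such a file using only the Python standard library. The helpers `write_gz` and `read_gz` and the sample dictionary are hypothetical names introduced for this example; they are not part of MIPLearn, and `write_pkl_gz` may differ in its details.\n",
    "\n",
    "```python\n",
    "import gzip\n",
    "import os\n",
    "import pickle\n",
    "import tempfile\n",
    "\n",
    "\n",
    "def write_gz(obj, path):\n",
    "    # Hypothetical helper: pickle the object into a gzip-compressed file.\n",
    "    with gzip.open(path, 'wb') as f:\n",
    "        pickle.dump(obj, f)\n",
    "\n",
    "\n",
    "def read_gz(path):\n",
    "    # Hypothetical helper: decompress the file and unpickle its contents.\n",
    "    with gzip.open(path, 'rb') as f:\n",
    "        return pickle.load(f)\n",
    "\n",
    "\n",
    "# Round trip through a temporary directory, so no files are left behind.\n",
    "with tempfile.TemporaryDirectory() as tmp:\n",
    "    path = os.path.join(tmp, '00000.pkl.gz')\n",
    "    write_gz({'demand': 123.0}, path)\n",
    "    restored = read_gz(path)\n",
    "```\n",
    "\n",
    "Because each instance lives in its own file, a worker process only needs to load the one instance it is currently solving."
   ]
  },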
  {
   "cell_type": "markdown",
   "id": "b17af877",
   "metadata": {},
   "source": [
    "Finally, we use `BasicCollector` to collect the optimal solutions and other useful training data for all training instances. The data is stored in HDF5 files `uc/train/00000.h5`, `uc/train/00001.h5`, etc. The optimization models are also exported to compressed MPS files `uc/train/00000.mps.gz`, `uc/train/00001.mps.gz`, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "7623f002",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:03:35.571497019Z",
     "start_time": "2023-06-06T20:03:25.804104036Z"
    }
   },
   "outputs": [],
   "source": [
    "from miplearn.collectors.basic import BasicCollector\n",
    "\n",
    "bc = BasicCollector()\n",
    "bc.collect(train_data, build_uc_model, n_jobs=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c42b1be1-9723-4827-82d8-974afa51ef9f",
   "metadata": {},
   "source": [
    "## Training and solving test instances"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a33c6aa4-f0b8-4ccb-9935-01f7d7de2a1c",
   "metadata": {},
   "source": [
    "With training data in hand, we can now design and train a machine learning model to accelerate solver performance. In this tutorial, for illustration purposes, we will use ML to generate a good warm start using $k$-nearest neighbors. More specifically, the strategy is to:\n",
    "\n",
    "1. Memorize the optimal solutions of all training instances;\n",
    "2. Given a test instance, find the 25 most similar training instances, based on constraint right-hand sides;\n",
    "3. Merge their optimal solutions into a single partial solution; specifically, only assign values to the binary variables that agree unanimously;\n",
    "4. Provide this partial solution to the solver as a warm start.\n",
    "\n",
    "This simple strategy can be implemented as shown below, using `MemorizingPrimalComponent`. For more advanced strategies, and for the usage of more advanced classifiers, see the user guide."
   ]
  },
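  {
   "cell_type": "markdown",
   "id": "sketch-knn-unanimous-merge",
   "metadata": {},
   "source": [
    "To make steps 2 and 3 above more concrete, here is a minimal NumPy sketch of the idea, assuming Euclidean distance between right-hand-side vectors. The function `merge_unanimous` and the toy arrays are hypothetical and are not MIPLearn's implementation; variables on which the neighbors disagree are left unassigned (`nan`).\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def merge_unanimous(x_train, y_train, x_test, k=25):\n",
    "    # Distance from the test instance to every training instance.\n",
    "    dist = np.linalg.norm(x_train - x_test, axis=1)\n",
    "    # Indices of the k most similar training instances.\n",
    "    nearest = np.argsort(dist)[:k]\n",
    "    # Fraction of neighbors that set each binary variable to one.\n",
    "    mean = y_train[nearest].mean(axis=0)\n",
    "    # Start with every variable unassigned, then fix the unanimous ones.\n",
    "    start = np.full(y_train.shape[1], np.nan)\n",
    "    start[mean == 0.0] = 0.0\n",
    "    start[mean == 1.0] = 1.0\n",
    "    return start\n",
    "\n",
    "\n",
    "# Toy data: four training instances, one feature, three binary variables.\n",
    "x_train = np.array([[1.0], [2.0], [3.0], [10.0]])\n",
    "y_train = np.array(\n",
    "    [[1, 0, 1], [1, 0, 0], [1, 1, 0], [0, 1, 1]],\n",
    "    dtype=float,\n",
    ")\n",
    "partial = merge_unanimous(x_train, y_train, np.array([2.1]), k=2)\n",
    "```\n",
    "\n",
    "In this toy case the two nearest neighbors agree on the first and third variables, so only those two receive a value; the second variable is left for the solver to decide."
   ]
  },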
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "435f7bf8-4b09-4889-b1ec-b7b56e7d8ed2",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:05:20.497772794Z",
     "start_time": "2023-06-06T20:05:20.484821405Z"
    }
   },
   "outputs": [],
   "source": [
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from miplearn.components.primal.actions import SetWarmStart\n",
    "from miplearn.components.primal.mem import (\n",
    "    MemorizingPrimalComponent,\n",
    "    MergeTopSolutions,\n",
    ")\n",
    "from miplearn.extractors.fields import H5FieldsExtractor\n",
    "\n",
    "comp = MemorizingPrimalComponent(\n",
    "    clf=KNeighborsClassifier(n_neighbors=25),\n",
    "    extractor=H5FieldsExtractor(\n",
    "        instance_fields=[\"static_constr_rhs\"],\n",
    "    ),\n",
    "    constructor=MergeTopSolutions(25, [0.0, 1.0]),\n",
    "    action=SetWarmStart(),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9536e7e4-0b0d-49b0-bebd-4a848f839e94",
   "metadata": {},
   "source": [
    "Having defined the ML strategy, we next construct `LearningSolver`, train the ML component and optimize one of the test instances."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "9d13dd50-3dcf-4673-a757-6f44dcc0dedf",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:05:22.672002339Z",
     "start_time": "2023-06-06T20:05:21.447466634Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x5e67c6ee\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve removed 1000 rows and 500 columns\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1 rows, 500 columns, 500 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.6166537e+09   5.648803e+04   0.000000e+00      0s\n",
      "       1    8.2906219e+09   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  8.290621916e+09\n",
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0xa4a7961e\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "\n",
      "User MIP start produced solution with objective 8.30129e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.29184e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.29146e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.29146e+09 (0.02s)\n",
      "Loaded user MIP start with objective 8.29146e+09\n",
      "\n",
      "Presolve time: 0.01s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "\n",
      "Root relaxation: objective 8.290622e+09, 512 iterations, 0.01 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 8.2906e+09    0    1 8.2915e+09 8.2906e+09  0.01%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Cover: 1\n",
      "  Flow cover: 2\n",
      "\n",
      "Explored 1 nodes (512 simplex iterations) in 0.09 seconds (0.01 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 3: 8.29146e+09 8.29184e+09 8.30129e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.291459497797e+09, best bound 8.290645029670e+09, gap 0.0098%\n",
      "WARNING: Cannot get reduced costs for MIP.\n",
      "WARNING: Cannot get duals for MIP.\n"
     ]
    }
   ],
   "source": [
    "from miplearn.solvers.learning import LearningSolver\n",
    "\n",
    "solver_ml = LearningSolver(components=[comp])\n",
    "solver_ml.fit(train_data)\n",
    "solver_ml.optimize(test_data[0], build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "61da6dad-7f56-4edb-aa26-c00eb5f946c0",
   "metadata": {},
   "source": [
    "By examining the solve log above, specifically the line `Loaded user MIP start with objective...`, we can see that MIPLearn was able to construct an initial solution which turned out to be very close to the optimal solution to the problem. Now let us repeat the code above, but with a solver which does not apply any ML strategies. Note that our previously-defined component is not provided."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "2ff391ed-e855-4228-aa09-a7641d8c2893",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:05:46.969575966Z",
     "start_time": "2023-06-06T20:05:46.420803286Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x5e67c6ee\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve removed 1000 rows and 500 columns\n",
      "Presolve time: 0.01s\n",
      "Presolved: 1 rows, 500 columns, 500 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.6166537e+09   5.648803e+04   0.000000e+00      0s\n",
      "       1    8.2906219e+09   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  8.290621916e+09\n",
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x8a0f9587\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve time: 0.00s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Found heuristic solution: objective 9.757128e+09\n",
      "\n",
      "Root relaxation: objective 8.290622e+09, 512 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 8.2906e+09    0    1 9.7571e+09 8.2906e+09  15.0%     -    0s\n",
      "H    0     0                    8.298273e+09 8.2906e+09  0.09%     -    0s\n",
      "     0     0 8.2907e+09    0    4 8.2983e+09 8.2907e+09  0.09%     -    0s\n",
      "     0     0 8.2907e+09    0    1 8.2983e+09 8.2907e+09  0.09%     -    0s\n",
      "     0     0 8.2907e+09    0    4 8.2983e+09 8.2907e+09  0.09%     -    0s\n",
      "H    0     0                    8.293980e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2907e+09    0    5 8.2940e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2907e+09    0    1 8.2940e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2907e+09    0    2 8.2940e+09 8.2907e+09  0.04%     -    0s\n",
      "     0     0 8.2908e+09    0    1 8.2940e+09 8.2908e+09  0.04%     -    0s\n",
      "     0     0 8.2908e+09    0    4 8.2940e+09 8.2908e+09  0.04%     -    0s\n",
      "     0     0 8.2908e+09    0    4 8.2940e+09 8.2908e+09  0.04%     -    0s\n",
      "H    0     0                    8.291465e+09 8.2908e+09  0.01%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Gomory: 2\n",
      "  MIR: 1\n",
      "\n",
      "Explored 1 nodes (1025 simplex iterations) in 0.08 seconds (0.03 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 4: 8.29147e+09 8.29398e+09 8.29827e+09 9.75713e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.291465302389e+09, best bound 8.290781665333e+09, gap 0.0082%\n",
      "WARNING: Cannot get reduced costs for MIP.\n",
      "WARNING: Cannot get duals for MIP.\n"
     ]
    }
   ],
   "source": [
    "solver_baseline = LearningSolver(components=[])\n",
    "solver_baseline.fit(train_data)\n",
    "solver_baseline.optimize(test_data[0], build_uc_model);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6d37b88-9fcc-43ee-ac1e-2a7b1e51a266",
   "metadata": {},
   "source": [
    "In the log above, the `MIP start` line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution at the end, but it required using its own internal heuristic procedures. In this example, because we solve very small optimization problems, there was almost no difference in terms of running time, but the difference can be significant for larger problems."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eec97f06",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Accessing the solution\n",
    "\n",
    "In the example above, we used `LearningSolver.optimize` together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders, and could be retrieved by reading these files, but that is not very convenient. In the following example, we show how to build and solve a Pyomo model entirely in-memory, using our trained solver."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "67a6cd18",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2023-06-06T20:06:26.913448568Z",
     "start_time": "2023-06-06T20:06:26.169047914Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x2dfe4e1c\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "Presolve removed 1000 rows and 500 columns\n",
      "Presolve time: 0.01s\n",
      "Presolved: 1 rows, 500 columns, 500 nonzeros\n",
      "\n",
      "Iteration    Objective       Primal Inf.    Dual Inf.      Time\n",
      "       0    6.5917580e+09   5.627453e+04   0.000000e+00      0s\n",
      "       1    8.2535968e+09   0.000000e+00   0.000000e+00      0s\n",
      "\n",
      "Solved in 1 iterations and 0.01 seconds (0.00 work units)\n",
      "Optimal objective  8.253596777e+09\n",
      "Set parameter QCPDual to value 1\n",
      "Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)\n",
      "\n",
      "CPU model: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz, instruction set [SSE2|AVX|AVX2]\n",
      "Thread count: 6 physical cores, 12 logical processors, using up to 12 threads\n",
      "\n",
      "Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros\n",
      "Model fingerprint: 0x20637200\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "Coefficient statistics:\n",
      "  Matrix range     [1e+00, 2e+06]\n",
      "  Objective range  [1e+00, 6e+07]\n",
      "  Bounds range     [1e+00, 1e+00]\n",
      "  RHS range        [3e+08, 3e+08]\n",
      "\n",
      "User MIP start produced solution with objective 8.25814e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.25512e+09 (0.01s)\n",
      "User MIP start produced solution with objective 8.25459e+09 (0.04s)\n",
      "User MIP start produced solution with objective 8.25459e+09 (0.04s)\n",
      "Loaded user MIP start with objective 8.25459e+09\n",
      "\n",
      "Presolve time: 0.01s\n",
      "Presolved: 1001 rows, 1000 columns, 2500 nonzeros\n",
      "Variable types: 500 continuous, 500 integer (500 binary)\n",
      "\n",
      "Root relaxation: objective 8.253597e+09, 512 iterations, 0.00 seconds (0.00 work units)\n",
      "\n",
      "    Nodes    |    Current Node    |     Objective Bounds      |     Work\n",
      " Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time\n",
      "\n",
      "     0     0 8.2536e+09    0    1 8.2546e+09 8.2536e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    3 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    1 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    4 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2537e+09    0    4 8.2546e+09 8.2537e+09  0.01%     -    0s\n",
      "     0     0 8.2538e+09    0    4 8.2546e+09 8.2538e+09  0.01%     -    0s\n",
      "     0     0 8.2538e+09    0    5 8.2546e+09 8.2538e+09  0.01%     -    0s\n",
      "     0     0 8.2538e+09    0    6 8.2546e+09 8.2538e+09  0.01%     -    0s\n",
      "\n",
      "Cutting planes:\n",
      "  Cover: 1\n",
      "  MIR: 2\n",
      "  StrongCG: 1\n",
      "  Flow cover: 1\n",
      "\n",
      "Explored 1 nodes (575 simplex iterations) in 0.11 seconds (0.01 work units)\n",
      "Thread count was 12 (of 12 available processors)\n",
      "\n",
      "Solution count 3: 8.25459e+09 8.25512e+09 8.25814e+09 \n",
      "\n",
      "Optimal solution found (tolerance 1.00e-04)\n",
      "Best objective 8.254590409970e+09, best bound 8.253768093811e+09, gap 0.0100%\n",
      "WARNING: Cannot get reduced costs for MIP.\n",
      "WARNING: Cannot get duals for MIP.\n",
      "obj = 8254590409.96973\n",
      " x = [1.0, 1.0, 0.0, 1.0, 1.0]\n",
      " y = [935662.0949263407, 1604270.0218116897, 0.0, 1369560.835229226, 602828.5321028307]\n"
     ]
    }
   ],
   "source": [
    "data = random_uc_data(samples=1, n=500)[0]\n",
    "model = build_uc_model(data)\n",
    "solver_ml.optimize(model)\n",
    "print(\"obj =\", model.inner.obj())\n",
    "print(\" x =\", [model.inner.x[i].value for i in range(5)])\n",
    "print(\" y =\", [model.inner.y[i].value for i in range(5)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5593d23a-83bd-4e16-8253-6300f5e3f63b",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
 }
--- a/miplearn/__init__.py
+++ b/miplearn/__init__.py
@@ -1,31 +1,3 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
-#  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
+#  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from .benchmark import BenchmarkRunner
 from .classifiers import Classifier, Regressor
 from .classifiers.adaptive import AdaptiveClassifier
 from .classifiers.sklearn import ScikitLearnRegressor, ScikitLearnClassifier
 from .classifiers.threshold import MinPrecisionThreshold
 from .components.component import Component
 from .components.dynamic_lazy import DynamicLazyConstraintsComponent
 from .components.dynamic_user_cuts import UserCutsComponent
 from .components.objective import ObjectiveValueComponent
 from .components.primal import PrimalSolutionComponent
 from .components.static_lazy import StaticLazyConstraintsComponent
 from .instance.base import Instance
 from .instance.picklegz import (
    PickleGzInstance,
    write_pickle_gz,
    read_pickle_gz,
    write_pickle_gz_multiple,
    save,
    load,
 )
 from .log import setup_logger
 from .solvers.gurobi import GurobiSolver
 from .solvers.internal import InternalSolver
 from .solvers.learning import LearningSolver
 from .solvers.pyomo.base import BasePyomoSolver
 from .solvers.pyomo.cplex import CplexPyomoSolver
 from .solvers.pyomo.gurobi import GurobiPyomoSolver
--- a/miplearn/benchmark.py
+++ b/miplearn/benchmark.py
@@ -1,264 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 import os
 from typing import Dict, List, Any, Optional, Callable
 import pandas as pd
 from miplearn.components.component import Component
 from miplearn.instance.base import Instance
 from miplearn.solvers.learning import LearningSolver, FileInstanceWrapper
 from miplearn.solvers.pyomo.gurobi import GurobiPyomoSolver
 from sklearn.utils._testing import ignore_warnings
 from sklearn.exceptions import ConvergenceWarning
 logger = logging.getLogger(__name__)
 class BenchmarkRunner:
    """
    Utility class that simplifies the task of comparing the performance of different
    solvers.
    Parameters
    ----------
    solvers: Dict[str, LearningSolver]
        Dictionary containing the solvers to compare. Solvers may have different
        arguments and components. The key should be the name of the solver. It
        appears in the exported tables of results.
    """
    def __init__(self, solvers: Dict[str, LearningSolver]) -> None:
        self.solvers: Dict[str, LearningSolver] = solvers
        self.results = pd.DataFrame(
            columns=[
                "Solver",
                "Instance",
            ]
        )
    def parallel_solve(
        self,
        filenames: List[str],
        build_model: Callable,
        n_jobs: int = 1,
        n_trials: int = 1,
        progress: bool = False,
    ) -> None:
        self._silence_miplearn_logger()
        trials = filenames * n_trials
        for (solver_name, solver) in self.solvers.items():
            results = solver.parallel_solve(
                trials,
                build_model,
                n_jobs=n_jobs,
                label="benchmark (%s)" % solver_name,
                progress=progress,
            )
            for i in range(len(trials)):
                idx = i % len(filenames)
                results[i]["Solver"] = solver_name
                results[i]["Instance"] = idx
                self.results = self.results.append(pd.DataFrame([results[i]]))
        self._restore_miplearn_logger()
    def write_csv(self, filename: str) -> None:
        """
        Writes the collected results to a CSV file.
        Parameters
        ----------
        filename: str
            The name of the file.
        """
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        self.results.to_csv(filename)
    def fit(
        self,
        filenames: List[str],
        build_model: Callable,
        progress: bool = False,
        n_jobs: int = 1,
    ) -> None:
        components = []
        instances: List[Instance] = [
            FileInstanceWrapper(f, build_model, mode="r") for f in filenames
        ]
        for (solver_name, solver) in self.solvers.items():
            if solver_name == "baseline":
                continue
            components += solver.components.values()
        Component.fit_multiple(
            components,
            instances,
            n_jobs=n_jobs,
            progress=progress,
        )
    def _silence_miplearn_logger(self) -> None:
        miplearn_logger = logging.getLogger("miplearn")
        self.prev_log_level = miplearn_logger.getEffectiveLevel()
        miplearn_logger.setLevel(logging.WARNING)
    def _restore_miplearn_logger(self) -> None:
        miplearn_logger = logging.getLogger("miplearn")
        miplearn_logger.setLevel(self.prev_log_level)
    def write_svg(
        self,
        output: Optional[str] = None,
    ) -> None:
        import matplotlib.pyplot as plt
        import pandas as pd
        import seaborn as sns
        sns.set_style("whitegrid")
        sns.set_palette("Blues_r")
        groups = self.results.groupby("Instance")
        best_lower_bound = groups["mip_lower_bound"].transform("max")
        best_upper_bound = groups["mip_upper_bound"].transform("min")
        self.results["Relative lower bound"] = self.results["mip_lower_bound"] / best_lower_bound
        self.results["Relative upper bound"] = self.results["mip_upper_bound"] / best_upper_bound
        if (self.results["mip_sense"] == "min").any():
            primal_column = "Relative upper bound"
            obj_column = "mip_upper_bound"
            predicted_obj_column = "Objective: Predicted upper bound"
        else:
            primal_column = "Relative lower bound"
            obj_column = "mip_lower_bound"
            predicted_obj_column = "Objective: Predicted lower bound"
        palette = {
            "baseline": "#9b59b6",
            "ml-exact": "#3498db",
            "ml-heuristic": "#95a5a6",
        }
        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(
            nrows=2,
            ncols=2,
            figsize=(8, 8),
        )
        # Wallclock time
        sns.stripplot(
            x="Solver",
            y="mip_wallclock_time",
            data=self.results,
            ax=ax1,
            jitter=0.25,
            palette=palette,
            size=2.0,
        )
        sns.barplot(
            x="Solver",
            y="mip_wallclock_time",
            data=self.results,
            ax=ax1,
            errwidth=0.0,
            alpha=0.4,
            palette=palette,
        )
        ax1.set(ylabel="Wallclock time (s)")
        # Gap
        sns.stripplot(
            x="Solver",
            y="Gap",
            jitter=0.25,
            data=self.results[self.results["Solver"] != "ml-heuristic"],
            ax=ax2,
            palette=palette,
            size=2.0,
        )
        ax2.set(ylabel="Relative MIP gap")
        # Relative primal bound
        sns.stripplot(
            x="Solver",
            y=primal_column,
            jitter=0.25,
            data=self.results[self.results["Solver"] == "ml-heuristic"],
            ax=ax3,
            palette=palette,
            size=2.0,
        )
        sns.scatterplot(
            x=obj_column,
            y=predicted_obj_column,
            hue="Solver",
            data=self.results[self.results["Solver"] == "ml-exact"],
            ax=ax4,
            palette=palette,
            size=2.0,
        )
        # Predicted vs actual primal bound
        xlim, ylim = ax4.get_xlim(), ax4.get_ylim()
        ax4.plot(
            [-1e10, 1e10],
            [-1e10, 1e10],
            ls="-",
            color="#cccccc",
        )
        ax4.set_xlim(xlim)
        ax4.set_ylim(xlim)
        ax4.get_legend().remove()
        ax4.set(
            ylabel="Predicted optimal value",
            xlabel="Actual optimal value",
        )
        fig.tight_layout()
        plt.savefig(output)
@ignore_warnings(category=ConvergenceWarning)
 def run_benchmarks(
    train_instances: List[Instance],
    test_instances: List[Instance],
    n_jobs: int = 4,
    n_trials: int = 1,
    progress: bool = False,
    solver: Any = None,
 ) -> None:
    if solver is None:
        solver = GurobiPyomoSolver()
    benchmark = BenchmarkRunner(
        solvers={
            "baseline": LearningSolver(
                solver=solver.clone(),
            ),
            "ml-exact": LearningSolver(
                solver=solver.clone(),
            ),
            "ml-heuristic": LearningSolver(
                solver=solver.clone(),
                mode="heuristic",
            ),
        }
    )
    benchmark.solvers["baseline"].parallel_solve(
        train_instances,
        n_jobs=n_jobs,
        progress=progress,
    )
    benchmark.fit(
        train_instances,
        n_jobs=n_jobs,
        progress=progress,
    )
    benchmark.parallel_solve(
        test_instances,
        n_jobs=n_jobs,
        n_trials=n_trials,
        progress=progress,
    )
    plot(benchmark.results)
--- a/miplearn/classifiers/__init__.py
+++ b/miplearn/classifiers/__init__.py
@@ -1,163 +1,3 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
-#  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
+#  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from abc import ABC, abstractmethod
 from typing import Optional
 import numpy as np
 class Classifier(ABC):
    """
    A Classifier decides which class each sample belongs to, based on historical
    data.
    """
    def __init__(self) -> None:
        self.n_features: Optional[int] = None
        self.n_classes: Optional[int] = None
    @abstractmethod
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        """
        Trains the classifier.
        Parameters
        ----------
        x_train: np.ndarray
            An array of features with shape (`n_samples`, `n_features`). Each entry
            must be a float.
        y_train: np.ndarray
            An array of labels with shape (`n_samples`, `n_classes`). Each entry must be
            a bool, and there must be exactly one True element in each row.
        """
        assert isinstance(x_train, np.ndarray)
        assert isinstance(y_train, np.ndarray)
        assert x_train.dtype in [
            np.float16,
            np.float32,
            np.float64,
        ], f"x_train.dtype should be float. Found {x_train.dtype} instead."
        assert y_train.dtype == np.bool8
        assert len(x_train.shape) == 2
        assert len(y_train.shape) == 2
        (n_samples_x, n_features) = x_train.shape
        (n_samples_y, n_classes) = y_train.shape
        assert n_samples_y == n_samples_x
        self.n_features = n_features
        self.n_classes = n_classes
    @abstractmethod
    def predict_proba(self, x_test: np.ndarray) -> np.ndarray:
        """
        Predicts the probability of each sample belonging to each class. Must be called
        after fit.
        Parameters
        ----------
        x_test: np.ndarray
            An array of features with shape (`n_samples`, `n_features`). The number of
            features in `x_test` must match the number of features in `x_train` provided
            to `fit`.
        Returns
        -------
        np.ndarray
            An array of predicted probabilities with shape (`n_samples`, `n_classes`),
            where `n_classes` is the number of columns in `y_train` provided to `fit`.
        """
        assert self.n_features is not None
        assert isinstance(x_test, np.ndarray)
        assert len(x_test.shape) == 2
        (n_samples, n_features_x) = x_test.shape
        assert n_features_x == self.n_features, (
            f"Test and training data have different number of "
            f"features: {n_features_x} != {self.n_features}"
        )
        return np.ndarray([])
    @abstractmethod
    def clone(self) -> "Classifier":
        """
        Returns an unfitted copy of this classifier with the same hyperparameters.
        """
        pass
 class Regressor(ABC):
    """
    A Regressor tries to predict the values of some continuous variables, given the
    values of other variables.
    """
    def __init__(self) -> None:
        self.n_inputs: Optional[int] = None
    @abstractmethod
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        """
        Trains the regressor.
        Parameters
        ----------
        x_train: np.ndarray
            An array of inputs with shape (`n_samples`, `n_inputs`). Each entry must be
            a float.
        y_train: np.ndarray
            An array of outputs with shape (`n_samples`, `n_outputs`). Each entry must
            be a float.
        """
        assert isinstance(x_train, np.ndarray)
        assert isinstance(y_train, np.ndarray)
        assert x_train.dtype in [np.float16, np.float32, np.float64]
        assert y_train.dtype in [np.float16, np.float32, np.float64]
        assert len(x_train.shape) == 2, (
            f"Parameter x_train should be a two-dimensional array. "
            f"Found {x_train.shape} ndarray instead."
        )
        assert len(y_train.shape) == 2, (
            f"Parameter y_train should be a two-dimensional array. "
            f"Found {y_train.shape} ndarray instead."
        )
        (n_samples_x, n_inputs) = x_train.shape
        (n_samples_y, n_outputs) = y_train.shape
        assert n_samples_y == n_samples_x
        self.n_inputs = n_inputs
    @abstractmethod
    def predict(self, x_test: np.ndarray) -> np.ndarray:
        """
        Predicts the values of the output variables. Must be called after fit.
        Parameters
        ----------
        x_test: np.ndarray
            An array of inputs with shape (`n_samples`, `n_inputs`), where `n_inputs`
            must match the number of columns in `x_train` provided to `fit`.
        Returns
        -------
        np.ndarray
            An array of outputs  with shape (`n_samples`, `n_outputs`), where
            `n_outputs` is the number of columns in `y_train` provided to `fit`.
        """
        assert self.n_inputs is not None
        assert isinstance(x_test, np.ndarray), (
            f"Parameter x_test must be np.ndarray. "
            f"Found {x_test.__class__.__name__} instead."
        )
        assert len(x_test.shape) == 2
        (n_samples, n_inputs_x) = x_test.shape
        assert n_inputs_x == self.n_inputs, (
            f"Test and training data have different number of "
            f"inputs: {n_inputs_x} != {self.n_inputs}"
        )
        return np.ndarray([])
    @abstractmethod
    def clone(self) -> "Regressor":
        """
        Returns an unfitted copy of this regressor with the same hyperparameters.
        """
        pass
--- a/miplearn/classifiers/adaptive.py
+++ b/miplearn/classifiers/adaptive.py
@@ -1,135 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Dict, Optional
 import numpy as np
 from sklearn.ensemble import RandomForestClassifier
 from sklearn.linear_model import LogisticRegression
 from sklearn.metrics import roc_auc_score
 from sklearn.model_selection import cross_val_predict
 from sklearn.neighbors import KNeighborsClassifier
 from sklearn.pipeline import make_pipeline
 from sklearn.preprocessing import StandardScaler
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.counting import CountingClassifier
 from miplearn.classifiers.sklearn import ScikitLearnClassifier
 logger = logging.getLogger(__name__)
 class CandidateClassifierSpecs:
    """
    Specifications describing how to construct a certain classifier, and under
    which circumstances it can be used.
    Parameters
    ----------
    classifier: Classifier
        Prototype classifier. It is cloned before being trained.
    min_samples: int
        Minimum number of samples for this classifier to be considered.
    """
    def __init__(
        self,
        classifier: Classifier,
        min_samples: int = 0,
    ) -> None:
        self.min_samples = min_samples
        self.classifier = classifier
 class AdaptiveClassifier(Classifier):
    """
    A meta-classifier which dynamically selects what actual classifier to use
    based on its cross-validation score on a particular training data set.
    Parameters
    ----------
    candidates: Dict[str, CandidateClassifierSpecs]
        A dictionary of candidate classifiers to consider, mapping the name of the
        candidate to its specs, which describes how to construct it and under what
        scenarios. If no candidates are provided, uses a fixed set of defaults,
        which includes `RandomForestClassifier`, `KNeighborsClassifier`,
        `LogisticRegression` and `CountingClassifier`.
    """
    def __init__(
        self,
        candidates: Optional[Dict[str, CandidateClassifierSpecs]] = None,
    ) -> None:
        super().__init__()
        if candidates is None:
            candidates = {
                "forest(5,10)": CandidateClassifierSpecs(
                    classifier=ScikitLearnClassifier(
                        RandomForestClassifier(
                            n_estimators=5,
                            min_samples_split=10,
                        ),
                    ),
                    min_samples=100,
                ),
                "knn(100)": CandidateClassifierSpecs(
                    classifier=ScikitLearnClassifier(
                        KNeighborsClassifier(n_neighbors=100)
                    ),
                    min_samples=100,
                ),
                "logistic": CandidateClassifierSpecs(
                    classifier=ScikitLearnClassifier(
                        make_pipeline(
                            StandardScaler(),
                            LogisticRegression(),
                        )
                    ),
                    min_samples=30,
                ),
                "counting": CandidateClassifierSpecs(
                    classifier=CountingClassifier(),
                ),
            }
        self.candidates = candidates
        self.classifier: Optional[Classifier] = None
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        super().fit(x_train, y_train)
        n_samples = x_train.shape[0]
        assert y_train.shape == (n_samples, 2)
        # If almost all samples belong to the same class, return a fixed prediction and
        # skip all the other steps.
        if y_train[:, 0].mean() > 0.99 or y_train[:, 1].mean() > 0.99:
            self.classifier = CountingClassifier()
            self.classifier.fit(x_train, y_train)
            return
        best_name, best_clf, best_score = None, None, -float("inf")
        for (name, specs) in self.candidates.items():
            if n_samples < specs.min_samples:
                continue
            clf = specs.classifier.clone()
            if isinstance(clf, ScikitLearnClassifier):
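                # Request class probabilities (not hard labels) for the ROC AUC score
                proba = cross_val_predict(
                    clf.inner_clf,
                    x_train,
                    y_train[:, 1],
                    method="predict_proba",
                )[:, 1]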
            else:
                clf.fit(x_train, y_train)
                proba = clf.predict_proba(x_train)[:, 1]
            score = roc_auc_score(y_train[:, 1], proba)
            if score > best_score:
                best_name, best_clf, best_score = name, clf, score
        logger.debug("Best classifier: %s (score=%.3f)" % (best_name, best_score))
        if isinstance(best_clf, ScikitLearnClassifier):
            best_clf.fit(x_train, y_train)
        self.classifier = best_clf
    def predict_proba(self, x_test: np.ndarray) -> np.ndarray:
        super().predict_proba(x_test)
        assert self.classifier is not None
        return self.classifier.predict_proba(x_test)
    def clone(self) -> "AdaptiveClassifier":
        return AdaptiveClassifier(self.candidates)
--- a/miplearn/classifiers/counting.py
+++ b/miplearn/classifiers/counting.py
@@ -1,45 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Optional, cast
 import numpy as np
 from miplearn.classifiers import Classifier
 class CountingClassifier(Classifier):
    """
    A classifier that generates constant predictions, based only on the frequency of
    the training labels. For example, suppose `y_train` is given by:
    ```python
    y_train = np.array([
        [True, False],
        [False, True],
        [False, True],
    ])
    ```
    Then `predict_proba` always returns approximately `[0.33, 0.67]` for every sample,
    regardless of the input features. It essentially counts how many times each label
    appeared, hence the name.
    """
    def __init__(self) -> None:
        super().__init__()
        self.mean: Optional[np.ndarray] = None
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        super().fit(x_train, y_train)
        self.mean = cast(np.ndarray, np.mean(y_train, axis=0))
    def predict_proba(self, x_test: np.ndarray) -> np.ndarray:
        super().predict_proba(x_test)
        n_samples = x_test.shape[0]
        return np.array([self.mean for _ in range(n_samples)])
    def __repr__(self) -> str:
        return "CountingClassifier(mean=%s)" % self.mean
    def clone(self) -> "CountingClassifier":
        return CountingClassifier()
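The frequency-based prediction described in the `CountingClassifier` docstring can be illustrated without NumPy. The following is a minimal plain-Python sketch, not the package's implementation; `column_means` and `constant_proba` are hypothetical helper names introduced here only for illustration.

```python
from typing import List


def column_means(y_train: List[List[bool]]) -> List[float]:
    """Fraction of training samples that carry each label (one column per label)."""
    n_samples = len(y_train)
    n_classes = len(y_train[0])
    return [sum(row[j] for row in y_train) / n_samples for j in range(n_classes)]


def constant_proba(mean: List[float], n_samples: int) -> List[List[float]]:
    """Repeat the same probability row for every test sample."""
    return [list(mean) for _ in range(n_samples)]


# Same labels as in the docstring example: one True per row
y_train = [
    [True, False],
    [False, True],
    [False, True],
]
mean = column_means(y_train)  # one third and two thirds
proba = constant_proba(mean, n_samples=4)
```

Every row of `proba` is identical, because the test features never enter the computation.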
--- a/miplearn/classifiers/cv.py
+++ b/miplearn/classifiers/cv.py
@@ -1,132 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Optional, List
 import numpy as np
 from sklearn.dummy import DummyClassifier
 from sklearn.linear_model import LogisticRegression
 from sklearn.model_selection import cross_val_score
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.sklearn import ScikitLearnClassifier
 logger = logging.getLogger(__name__)
 class CrossValidatedClassifier(Classifier):
    """
    A meta-classifier that, upon training, evaluates the performance of another
    candidate classifier on the training data set, using k-fold cross validation,
    then either adopts it, if its cv-score is high enough, or returns constant
    predictions for every x_test, otherwise.
    Parameters
    ----------
    classifier: ScikitLearnClassifier
        Prototype of the candidate classifier. It is cloned before being evaluated.
    threshold: float
        Number from zero to one indicating how well the candidate classifier must
        perform to be adopted. The threshold is specified in comparison to a dummy
        classifier trained on the same dataset. For example, a threshold of 0.0
        indicates that any classifier as good as the dummy predictor is acceptable. A
        threshold of 1.0 indicates that only classifiers with perfect
        cross-validation scores are acceptable. Other numbers are a linear
        interpolation of these two extremes.
    constant: Optional[List[bool]]
        If the candidate classifier fails to meet the threshold, use a dummy classifier
        which always returns this prediction instead. The list should have exactly as
        many elements as the number of columns of `y_train` provided to `fit`.
    cv: int
        Number of folds.
    scoring: str
        Scoring function.
    """
    def __init__(
        self,
        classifier: ScikitLearnClassifier = ScikitLearnClassifier(LogisticRegression()),
        threshold: float = 0.75,
        constant: Optional[List[bool]] = None,
        cv: int = 5,
        scoring: str = "accuracy",
    ):
        super().__init__()
        if constant is None:
            constant = [True, False]
        self.n_classes = len(constant)
        self.classifier: Optional[ScikitLearnClassifier] = None
        self.classifier_prototype = classifier
        self.constant: List[bool] = constant
        self.threshold = threshold
        self.cv = cv
        self.scoring = scoring
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        super().fit(x_train, y_train)
        (n_samples, n_classes) = y_train.shape
        assert n_classes == self.n_classes
        # Calculate dummy score and absolute score threshold
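        # Average only the positive-class column; averaging the whole one-hot
        # array would always give 1 / n_classes.
        y_train_avg = np.average(y_train[:, 1])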
        dummy_score = max(y_train_avg, 1 - y_train_avg)
        absolute_threshold = 1.0 * self.threshold + dummy_score * (1 - self.threshold)
        # Calculate cross validation score and decide which classifier to use
        clf = self.classifier_prototype.clone()
        assert clf is not None
        assert isinstance(clf, ScikitLearnClassifier), (
            f"The provided classifier must be a ScikitLearnClassifier. "
            f"Found {clf.__class__.__name__} instead. If this is a scikit-learn "
            f"classifier, you must wrap it with ScikitLearnClassifier."
        )
        cv_score = float(
            np.mean(
                cross_val_score(
                    clf.inner_clf,
                    x_train,
                    y_train[:, 1],
                    cv=self.cv,
                    scoring=self.scoring,
                )
            )
        )
        if cv_score >= absolute_threshold:
            logger.debug(
                "cv_score is above threshold (%.2f >= %.2f); keeping"
                % (cv_score, absolute_threshold)
            )
            self.classifier = clf
        else:
            logger.debug(
                "cv_score is below threshold (%.2f < %.2f); discarding"
                % (cv_score, absolute_threshold)
            )
            self.classifier = ScikitLearnClassifier(
                DummyClassifier(
                    strategy="constant",
                    constant=self.constant[1],
                )
            )
        # Train chosen classifier
        assert self.classifier is not None
        assert isinstance(self.classifier, ScikitLearnClassifier)
        self.classifier.fit(x_train, y_train)
    def predict_proba(self, x_test: np.ndarray) -> np.ndarray:
        super().predict_proba(x_test)
        assert self.classifier is not None
        return self.classifier.predict_proba(x_test)
    def clone(self) -> "CrossValidatedClassifier":
        return CrossValidatedClassifier(
            classifier=self.classifier_prototype,
            threshold=self.threshold,
            constant=self.constant,
            cv=self.cv,
            scoring=self.scoring,
        )
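The linear interpolation described in the `threshold` parameter of `CrossValidatedClassifier` can be checked with a few lines of arithmetic. This is a minimal sketch under one assumption: the dummy score is taken to be the accuracy of always predicting the majority class. The function names `dummy_score` and `absolute_threshold` are hypothetical and introduced here for illustration.

```python
def dummy_score(positive_fraction: float) -> float:
    """Accuracy obtained by always predicting the majority class (assumption)."""
    return max(positive_fraction, 1.0 - positive_fraction)


def absolute_threshold(threshold: float, score: float) -> float:
    """Interpolate between the dummy score (threshold=0) and 1.0 (threshold=1)."""
    return 1.0 * threshold + score * (1.0 - threshold)


score = dummy_score(0.8)  # 80% of the samples are positive
low = absolute_threshold(0.0, score)  # any classifier as good as the dummy
high = absolute_threshold(1.0, score)  # only perfect cross-validation scores
default = absolute_threshold(0.75, score)  # the constructor's default threshold
```

With a dummy score of 0.8, the default threshold of 0.75 requires a cross-validation score of roughly 0.95 before the candidate classifier is adopted.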
--- a/miplearn/classifiers/minprob.py
+++ b/miplearn/classifiers/minprob.py
@@ -0,0 +1,61 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import List, Any, Callable, Optional
 import numpy as np
 import sklearn
 from sklearn.base import BaseEstimator
 from sklearn.utils.multiclass import unique_labels
 class MinProbabilityClassifier(BaseEstimator):
    """
    Meta-classifier that returns NaN for predictions made by a base classifier that
    have probability below a given threshold. More specifically, this meta-classifier
    calls base_clf.predict_proba and compares the result against the provided
    thresholds. If the probability for one of the classes is above its threshold,
    the meta-classifier returns that prediction. Otherwise, it returns NaN.
    """
    def __init__(
        self,
        base_clf: Any,
        thresholds: List[float],
        clone_fn: Callable[[Any], Any] = sklearn.base.clone,
    ) -> None:
        assert len(thresholds) == 2
        self.base_clf = base_clf
        self.thresholds = thresholds
        self.clone_fn = clone_fn
        self.clf_: Optional[Any] = None
        self.classes_: Optional[List[Any]] = None
    def fit(self, x: np.ndarray, y: np.ndarray) -> None:
        assert len(y.shape) == 1
        assert len(x.shape) == 2
        classes = unique_labels(y)
        assert len(classes) == len(self.thresholds)
        self.clf_ = self.clone_fn(self.base_clf)
        self.clf_.fit(x, y)
        self.classes_ = self.clf_.classes_
    def predict(self, x: np.ndarray) -> np.ndarray:
        assert self.clf_ is not None
        assert self.classes_ is not None
        y_proba = self.clf_.predict_proba(x)
        assert len(y_proba.shape) == 2
        assert y_proba.shape[0] == x.shape[0]
        assert y_proba.shape[1] == 2
        n_samples = x.shape[0]
        y_pred = []
        for sample_idx in range(n_samples):
            yi = float("nan")
            for (class_idx, class_val) in enumerate(self.classes_):
                if y_proba[sample_idx, class_idx] >= self.thresholds[class_idx]:
                    yi = class_val
            y_pred.append(yi)
        return np.array(y_pred)
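The thresholding rule in `MinProbabilityClassifier.predict` can be tried in isolation on a hand-written probability table. The following plain-Python sketch mirrors the loop above but is not the package's code; `apply_thresholds` is a hypothetical name, and the probabilities are made-up illustration values.

```python
import math
from typing import List, Sequence


def apply_thresholds(
    y_proba: Sequence[Sequence[float]],
    classes: Sequence[float],
    thresholds: Sequence[float],
) -> List[float]:
    """Return the class whose probability meets its threshold, or NaN if none does."""
    y_pred: List[float] = []
    for row in y_proba:
        yi = float("nan")
        for class_idx, class_val in enumerate(classes):
            # If several classes qualify, the last one wins, as in the loop above
            if row[class_idx] >= thresholds[class_idx]:
                yi = class_val
        y_pred.append(yi)
    return y_pred


y_pred = apply_thresholds(
    y_proba=[[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]],
    classes=[0.0, 1.0],
    thresholds=[0.90, 0.80],
)
# Count the samples for which no prediction was confident enough
n_skipped = sum(1 for y in y_pred if math.isnan(y))
```

Only the second sample is left undecided here, since neither of its probabilities reaches the corresponding threshold.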
--- a/miplearn/classifiers/singleclass.py
+++ b/miplearn/classifiers/singleclass.py
@@ -0,0 +1,51 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Callable, Optional
 import numpy as np
 import sklearn.base
 from sklearn.base import BaseEstimator
 from sklearn.utils.multiclass import unique_labels
 class SingleClassFix(BaseEstimator):
    """
    Some sklearn classifiers, such as logistic regression, have issues with datasets
    that contain a single class. This meta-classifier fixes the issue. If the
    training data contains a single class, this meta-classifier always returns that
    class as a prediction. Otherwise, it fits the provided base classifier,
    and returns its predictions instead.
    """
    def __init__(
        self,
        base_clf: BaseEstimator,
        clone_fn: Callable = sklearn.base.clone,
    ):
        self.base_clf = base_clf
        self.clf_: Optional[BaseEstimator] = None
        self.constant_ = None
        self.classes_ = None
        self.clone_fn = clone_fn
    def fit(self, x: np.ndarray, y: np.ndarray) -> None:
        classes = unique_labels(y)
        if len(classes) == 1:
            assert classes[0] is not None
            self.clf_ = None
            self.constant_ = classes[0]
            self.classes_ = classes
        else:
            self.clf_ = self.clone_fn(self.base_clf)
            assert self.clf_ is not None
            self.clf_.fit(x, y)
            self.constant_ = None
            self.classes_ = self.clf_.classes_
    def predict(self, x: np.ndarray) -> np.ndarray:
        if self.constant_ is not None:
            return np.full(x.shape[0], self.constant_)
        else:
            assert self.clf_ is not None
            return self.clf_.predict(x)
--- a/miplearn/classifiers/sklearn.py
+++ b/miplearn/classifiers/sklearn.py
@@ -1,93 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Optional, Any, cast
 import numpy as np
 import sklearn
 from miplearn.classifiers import Classifier, Regressor
 class ScikitLearnClassifier(Classifier):
    """
    Wrapper for ScikitLearn classifiers, which makes sure inputs and outputs have the
    correct dimensions and types.
    """
    def __init__(self, clf: Any) -> None:
        super().__init__()
        self.inner_clf = clf
        self.constant: Optional[np.ndarray] = None
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        super().fit(x_train, y_train)
        (n_samples, n_classes) = y_train.shape
        assert n_classes == 2, (
            f"Scikit-learn classifiers must have exactly two classes. "
            f"{n_classes} classes were provided instead."
        )
        # When all samples belong to the same class, sklearn's predict_proba returns
        # an array with a single column. The following check avoids this strange
        # behavior.
        mean = cast(np.ndarray, y_train.astype(float).mean(axis=0))
        if mean.max() == 1.0:
            self.constant = mean
            return
        self.inner_clf.fit(x_train, y_train[:, 1])
    def predict_proba(self, x_test: np.ndarray) -> np.ndarray:
        super().predict_proba(x_test)
        n_samples = x_test.shape[0]
        if self.constant is not None:
            return np.array([self.constant for n in range(n_samples)])
        sklearn_proba = self.inner_clf.predict_proba(x_test)
        if isinstance(sklearn_proba, list):
            assert len(sklearn_proba) == self.n_classes
            for pb in sklearn_proba:
                assert isinstance(pb, np.ndarray)
                assert pb.dtype in [np.float16, np.float32, np.float64]
                assert pb.shape == (n_samples, 2)
            proba = np.hstack([pb[:, [1]] for pb in sklearn_proba])
            assert proba.shape == (n_samples, self.n_classes)
            return proba
        else:
            assert isinstance(sklearn_proba, np.ndarray)
            assert sklearn_proba.shape == (n_samples, 2)
            return sklearn_proba
    def clone(self) -> "ScikitLearnClassifier":
        return ScikitLearnClassifier(
            clf=sklearn.base.clone(self.inner_clf),
        )
 class ScikitLearnRegressor(Regressor):
    """
    Wrapper for ScikitLearn regressors, which makes sure inputs and outputs have the
    correct dimensions and types.
    """
    def __init__(self, reg: Any) -> None:
        super().__init__()
        self.inner_reg = reg
    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> None:
        super().fit(x_train, y_train)
        self.inner_reg.fit(x_train, y_train)
    def predict(self, x_test: np.ndarray) -> np.ndarray:
        super().predict(x_test)
        n_samples = x_test.shape[0]
        sklearn_pred = self.inner_reg.predict(x_test)
        assert isinstance(sklearn_pred, np.ndarray)
        assert sklearn_pred.shape[0] == n_samples
        return sklearn_pred
    def clone(self) -> "ScikitLearnRegressor":
        return ScikitLearnRegressor(
            reg=sklearn.base.clone(self.inner_reg),
        )
--- a/miplearn/classifiers/threshold.py
+++ b/miplearn/classifiers/threshold.py
@@ -1,143 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from abc import abstractmethod, ABC
 from typing import Optional, List
 import numpy as np
 from sklearn.metrics._ranking import _binary_clf_curve
 from sklearn.model_selection import cross_val_predict
 from miplearn.classifiers.sklearn import ScikitLearnClassifier
 from miplearn.classifiers.adaptive import AdaptiveClassifier
 from miplearn.classifiers import Classifier
 class Threshold(ABC):
    """
    Solver components ask the machine learning models how confident they are in each
    prediction they make, then automatically discard all predictions that have low
    confidence. A Threshold specifies how confident the ML models should be for a
    prediction to be considered trustworthy.
    To model dynamic thresholds, which automatically adjust themselves during
    training to reach some desired target (such as minimum precision, or minimum
    recall), thresholds behave somewhat similarly to ML models themselves, with `fit`
    and `predict` methods.
    """
    @abstractmethod
    def fit(
        self,
        clf: Classifier,
        x_train: np.ndarray,
        y_train: np.ndarray,
    ) -> None:
        """
        Given a trained binary classifier `clf`, calibrates itself based on the
        classifier's performance on the given training data set.
        """
        assert isinstance(clf, Classifier)
        assert isinstance(x_train, np.ndarray)
        assert isinstance(y_train, np.ndarray)
        n_samples = x_train.shape[0]
        assert y_train.shape[0] == n_samples
    @abstractmethod
    def predict(self, x_test: np.ndarray) -> List[float]:
        """
        Returns the minimum probability for a machine learning prediction to be
        considered trustworthy. There is one value for each label.
        """
        pass
    @abstractmethod
    def clone(self) -> "Threshold":
        """
        Returns an unfitted copy of this threshold with the same hyperparameters.
        """
        pass
 class MinProbabilityThreshold(Threshold):
    """
    A threshold which considers predictions trustworthy if their probability of being
    correct, as computed by the machine learning models, is above a fixed value.
    """
    def __init__(self, min_probability: List[float]):
        self.min_probability = min_probability
    def fit(self, clf: Classifier, x_train: np.ndarray, y_train: np.ndarray) -> None:
        pass
    def predict(self, x_test: np.ndarray) -> List[float]:
        return self.min_probability
    def clone(self) -> "MinProbabilityThreshold":
        return MinProbabilityThreshold(self.min_probability)
 class MinPrecisionThreshold(Threshold):
    """
    A dynamic threshold which automatically adjusts itself during training to ensure
    that the component achieves at least a given precision `p` on the training data
    set. Note that increasing a component's minimum precision may reduce its recall.
    """
    def __init__(self, min_precision: List[float]) -> None:
        self.min_precision = min_precision
        self._computed_threshold: Optional[List[float]] = None
    def fit(
        self,
        clf: Classifier,
        x_train: np.ndarray,
        y_train: np.ndarray,
    ) -> None:
        super().fit(clf, x_train, y_train)
        (n_samples, n_classes) = y_train.shape
        if isinstance(clf, AdaptiveClassifier) and isinstance(
            clf.classifier, ScikitLearnClassifier
        ):
            proba = cross_val_predict(
                clf.classifier.inner_clf,
                x_train,
                y_train[:, 1],
                method="predict_proba",
            )
        else:
            proba = clf.predict_proba(x_train)
        self._computed_threshold = [
            self._compute(
                y_train[:, i],
                proba[:, i],
                self.min_precision[i],
            )
            for i in range(n_classes)
        ]
    def predict(self, x_test: np.ndarray) -> List[float]:
        assert self._computed_threshold is not None
        return self._computed_threshold
    @staticmethod
    def _compute(
        y_actual: np.ndarray,
        y_prob: np.ndarray,
        min_precision: float,
        min_recall: float = 0.1,
    ) -> float:
        fps, tps, thresholds = _binary_clf_curve(y_actual, y_prob)
        precision = tps / (tps + fps)
        recall = tps / tps[-1]
        for k in reversed(range(len(precision))):
            if precision[k] >= min_precision and recall[k] >= min_recall:
                return thresholds[k]
        return float("inf")
    def clone(self) -> "MinPrecisionThreshold":
        return MinPrecisionThreshold(
            min_precision=self.min_precision,
        )
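The search performed by `MinPrecisionThreshold._compute` can be reproduced without scikit-learn's private `_binary_clf_curve` helper. The sketch below is a plain-Python approximation under stated assumptions, not the package's implementation: `min_precision_threshold` is a hypothetical name, the curve is built by hand from sorted scores, and a dataset without positive samples returns infinity instead of dividing by zero.

```python
from typing import List, Sequence, Tuple


def min_precision_threshold(
    y_actual: Sequence[bool],
    y_prob: Sequence[float],
    min_precision: float,
    min_recall: float = 0.1,
) -> float:
    """Lowest probability cutoff whose precision and recall meet the minimums."""
    # Sort samples by decreasing predicted probability
    pairs = sorted(zip(y_prob, y_actual), key=lambda p: -p[0])
    n_positive = sum(1 for _, y in pairs if y)
    if n_positive == 0:
        return float("inf")
    # Record (cutoff, precision, recall) at each distinct probability value
    curve: List[Tuple[float, float, float]] = []
    tps, fps = 0, 0
    for k, (prob, is_positive) in enumerate(pairs):
        if is_positive:
            tps += 1
        else:
            fps += 1
        is_last_of_value = k + 1 == len(pairs) or pairs[k + 1][0] != prob
        if is_last_of_value:
            curve.append((prob, tps / (tps + fps), tps / n_positive))
    # Scan from the lowest cutoff upwards, mirroring the reversed loop above
    for prob, precision, recall in reversed(curve):
        if precision >= min_precision and recall >= min_recall:
            return prob
    return float("inf")


y_actual = [True, True, False, True, False]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.2]
loose = min_precision_threshold(y_actual, y_prob, min_precision=0.7)
strict = min_precision_threshold(y_actual, y_prob, min_precision=0.9)
```

Raising the required precision from 0.7 to 0.9 moves the cutoff from 0.6 up to 0.8, which also lowers recall from 3/3 to 2/3. This is the precision/recall trade-off mentioned in the class docstring.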
--- a/miplearn/collectors/__init__.py
+++ b/miplearn/collectors/__init__.py
--- a/miplearn/collectors/basic.py
+++ b/miplearn/collectors/basic.py
@@ -0,0 +1,86 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import json
 import os
 from io import StringIO
 from os.path import exists
 from typing import Callable, List
 from ..h5 import H5File
 from ..io import _RedirectOutput, gzip, _to_h5_filename
 from ..parallel import p_umap
 class BasicCollector:
    def collect(
        self,
        filenames: List[str],
        build_model: Callable,
        n_jobs: int = 1,
        progress: bool = False,
    ) -> None:
        def _collect(data_filename):
            h5_filename = _to_h5_filename(data_filename)
            mps_filename = h5_filename.replace(".h5", ".mps")
            if exists(h5_filename):
                # Try to read optimal solution
                mip_var_values = None
                try:
                    with H5File(h5_filename, "r") as h5:
                        mip_var_values = h5.get_array("mip_var_values")
                except Exception:
                    pass
                if mip_var_values is None:
                    print(f"Removing empty/corrupted h5 file: {h5_filename}")
                    os.remove(h5_filename)
                else:
                    return
            with H5File(h5_filename, "w") as h5:
                streams = [StringIO()]
                with _RedirectOutput(streams):
                    # Load and extract static features
                    model = build_model(data_filename)
                    model.extract_after_load(h5)
                    # Solve LP relaxation
                    relaxed = model.relax()
                    relaxed.optimize()
                    relaxed.extract_after_lp(h5)
                    # Solve MIP
                    model.optimize()
                    model.extract_after_mip(h5)
                    # Add lazy constraints to model
                    if (
                        hasattr(model, "fix_violations")
                        and model.fix_violations is not None
                    ):
                        model.fix_violations(model, model.violations_, "aot")
                        h5.put_scalar(
                            "mip_constr_violations", json.dumps(model.violations_)
                        )
                    # Save MPS file
                    model.write(mps_filename)
                    gzip(mps_filename)
                h5.put_scalar("mip_log", streams[0].getvalue())
        if n_jobs > 1:
            p_umap(
                _collect,
                filenames,
                num_cpus=n_jobs,
                desc="collect",
                smoothing=0,
                disable=not progress,
            )
        else:
            for filename in filenames:
                _collect(filename)
--- a/miplearn/collectors/lazy.py
+++ b/miplearn/collectors/lazy.py
@@ -0,0 +1,117 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from io import StringIO
 from typing import Callable
 import gurobipy as gp
 import numpy as np
 from gurobipy import GRB, LinExpr
 from ..h5 import H5File
 from ..io import _RedirectOutput
 class LazyCollector:
    def __init__(
        self,
        min_constrs: int = 100_000,
        time_limit: float = 900,
    ) -> None:
        self.min_constrs = min_constrs
        self.time_limit = time_limit
    def collect(
        self, data_filename: str, build_model: Callable, tol: float = 1e-6
    ) -> None:
        h5_filename = f"{data_filename}.h5"
        with H5File(h5_filename, "r+") as h5:
            streams = [StringIO()]
            lazy = None
            with _RedirectOutput(streams):
                slacks = h5.get_array("mip_constr_slacks")
                assert slacks is not None
                # Check minimum problem size
                if len(slacks) < self.min_constrs:
                    print("Problem is too small. Skipping.")
                    h5.put_array("mip_constr_lazy", np.zeros(len(slacks)))
                    return
                # Load model
                print("Loading model...")
                model = build_model(data_filename)
                model.params.LazyConstraints = True
                model.params.timeLimit = self.time_limit
                gp_constrs = np.array(model.getConstrs())
                gp_vars = np.array(model.getVars())
                # Load constraints
                lhs = h5.get_sparse("static_constr_lhs")
                rhs = h5.get_array("static_constr_rhs")
                sense = h5.get_array("static_constr_sense")
                assert lhs is not None
                assert rhs is not None
                assert sense is not None
                lhs_csr = lhs.tocsr()
                lhs_csc = lhs.tocsc()
                constr_idx = np.array(range(len(rhs)))
                lazy = np.zeros(len(rhs))
                # Drop loose constraints
                selected = (slacks > 0) & ((sense == b"<") | (sense == b">"))
                loose_constrs = gp_constrs[selected]
                print(
                    f"Removing {len(loose_constrs):,d} constraints (out of {len(rhs):,d})..."
                )
                model.remove(list(loose_constrs))
                # Filter to constraints that were dropped
                lhs_csr = lhs_csr[selected, :]
                lhs_csc = lhs_csc[selected, :]
                rhs = rhs[selected]
                sense = sense[selected]
                constr_idx = constr_idx[selected]
                lazy[selected] = 1
                # Load warm start
                var_names = h5.get_array("static_var_names")
                var_values = h5.get_array("mip_var_values")
                assert var_values is not None
                assert var_names is not None
                for (var_idx, var_name) in enumerate(var_names):
                    var = model.getVarByName(var_name.decode())
                    var.start = var_values[var_idx]
                print("Solving MIP with lazy constraints callback...")
                def callback(model: gp.Model, where: int) -> None:
                    assert rhs is not None
                    assert lazy is not None
                    assert sense is not None
                    if where == GRB.Callback.MIPSOL:
                        x_val = np.array(model.cbGetSolution(model.getVars()))
                        slack = lhs_csc * x_val - rhs
                        slack[sense == b">"] *= -1
                        is_violated = slack > tol
                        for (j, rhs_j) in enumerate(rhs):
                            if is_violated[j]:
                                lazy[constr_idx[j]] = 0
                                expr = LinExpr(
                                    lhs_csr[j, :].data, gp_vars[lhs_csr[j, :].indices]
                                )
                                if sense[j] == b"<":
                                    model.cbLazy(expr <= rhs_j)
                                elif sense[j] == b">":
                                    model.cbLazy(expr >= rhs_j)
                                else:
                                    raise RuntimeError(f"Unknown sense: {sense[j]}")
                model.optimize(callback)
                print(f"Marking {lazy.sum():,.0f} constraints as lazy...")
            h5.put_array("mip_constr_lazy", lazy)
            h5.put_scalar("mip_constr_lazy_log", streams[0].getvalue())
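
The callback in `LazyCollector.collect` decides which dropped constraints to re-add by computing `lhs * x - rhs`, flipping the sign for `>` rows, and comparing against `tol`. The sketch below restates that test in plain Python. `find_violated` and its dictionary-based row format are assumptions made for illustration; the source works on SciPy sparse matrices and byte-string senses instead.

```python
from typing import Dict, List, Sequence


def find_violated(
    rows: Sequence[Dict[int, float]],  # one {var_index: coefficient} dict per constraint
    rhs: Sequence[float],
    sense: Sequence[str],  # "<" or ">" (the source stores b"<" and b">")
    x: Sequence[float],
    tol: float = 1e-6,
) -> List[int]:
    """Return indices of constraints violated by x by more than tol."""
    violated = []
    for j, row in enumerate(rows):
        activity = sum(coef * x[i] for i, coef in row.items())
        slack = activity - rhs[j]
        if sense[j] == ">":
            # For a >= row, a violation means activity is below rhs.
            slack = -slack
        elif sense[j] != "<":
            raise ValueError(f"Unknown sense: {sense[j]}")
        if slack > tol:
            violated.append(j)
    return violated
```

For instance, with rows `x0 + x1 <= 1` and `x0 >= 0.5`, the point `(1.0, 0.5)` violates only the first row, while `(0.25, 0.25)` violates only the second.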
--- a/miplearn/collectors/priority.py
+++ b/miplearn/collectors/priority.py
@@ -0,0 +1,49 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import os
 import subprocess
 from typing import Callable
 from ..h5 import H5File
 class BranchPriorityCollector:
    def __init__(
        self,
        time_limit: float = 900.0,
        print_interval: int = 1,
        node_limit: int = 500,
    ) -> None:
        self.time_limit = time_limit
        self.print_interval = print_interval
        self.node_limit = node_limit
    def collect(self, data_filename: str, _: Callable) -> None:
        basename = data_filename.replace(".pkl.gz", "")
        env = os.environ.copy()
        env["JULIA_NUM_THREADS"] = "1"
        ret = subprocess.run(
            [
                "julia",
                "--project=.",
                "-e",
                (
                    f"using CPLEX, JuMP, MIPLearn.BB; "
                    f"BB.solve!("
                    f'    optimizer_with_attributes(CPLEX.Optimizer, "CPXPARAM_Threads" => 1),'
                    f'    "{basename}",'
                    f"    print_interval={self.print_interval},"
                    f"    time_limit={self.time_limit:.2f},"
                    f"    node_limit={self.node_limit},"
                    f")"
                ),
            ],
            check=True,
            capture_output=True,
            env=env,
        )
        h5_filename = f"{basename}.h5"
        with H5File(h5_filename, "r+") as h5:
            h5.put_scalar("bb_log", ret.stdout)
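
`BranchPriorityCollector.collect` assembles a one-line Julia program from its parameters and passes it to `julia -e`. The sketch below isolates just the string assembly so that the formatting (for example, `time_limit` rendered with two decimals) can be checked without launching Julia. `build_bb_script` and `strip_suffix` are hypothetical helper names, and the spacing differs slightly from the source.

```python
def strip_suffix(data_filename: str) -> str:
    """Drop the .pkl.gz extension, as done at the top of collect()."""
    return data_filename.replace(".pkl.gz", "")


def build_bb_script(
    basename: str,
    print_interval: int = 1,
    time_limit: float = 900.0,
    node_limit: int = 500,
) -> str:
    """Build the Julia snippet passed to `julia -e` (string assembly only)."""
    return (
        "using CPLEX, JuMP, MIPLearn.BB; "
        "BB.solve!("
        'optimizer_with_attributes(CPLEX.Optimizer, "CPXPARAM_Threads" => 1), '
        f'"{basename}", '
        f"print_interval={print_interval}, "
        f"time_limit={time_limit:.2f}, "
        f"node_limit={node_limit}"
        ")"
    )
```

Note that `basename` is interpolated verbatim, so a path containing a double quote would break the generated snippet; the source has the same property.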
--- a/miplearn/components/__init__.py
+++ b/miplearn/components/__init__.py
@@ -1,47 +1,3 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
-#  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
+#  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Dict
 def classifier_evaluation_dict(
    tp: int,
    tn: int,
    fp: int,
    fn: int,
 ) -> Dict[str, float]:
    p = tp + fn
    n = fp + tn
    d: Dict = {
        "Predicted positive": fp + tp,
        "Predicted negative": fn + tn,
        "Condition positive": p,
        "Condition negative": n,
        "True positive": tp,
        "True negative": tn,
        "False positive": fp,
        "False negative": fn,
        "Accuracy": (tp + tn) / (p + n),
        "F1 score": (2 * tp) / (2 * tp + fp + fn),
    }
    if p > 0:
        d["Recall"] = tp / p
    else:
        d["Recall"] = 1.0
    if tp + fp > 0:
        d["Precision"] = tp / (tp + fp)
    else:
        d["Precision"] = 1.0
    t = (p + n) / 100.0
    d["Predicted positive (%)"] = d["Predicted positive"] / t
    d["Predicted negative (%)"] = d["Predicted negative"] / t
    d["Condition positive (%)"] = d["Condition positive"] / t
    d["Condition negative (%)"] = d["Condition negative"] / t
    d["True positive (%)"] = d["True positive"] / t
    d["True negative (%)"] = d["True negative"] / t
    d["False positive (%)"] = d["False positive"] / t
    d["False negative (%)"] = d["False negative"] / t
    return d
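
As a worked example of the formulas in `classifier_evaluation_dict`, the sketch below recomputes a subset of the metrics from raw counts. `summarize` is a hypothetical stand-in written for this illustration, not the package function; it keeps the same convention of reporting recall and precision as 1.0 when their denominators are zero.

```python
from typing import Dict


def summarize(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Recompute a few of the metrics above from confusion-matrix counts."""
    p = tp + fn  # condition positive
    n = fp + tn  # condition negative
    return {
        "Accuracy": (tp + tn) / (p + n),
        "F1 score": (2 * tp) / (2 * tp + fp + fn),
        "Recall": tp / p if p > 0 else 1.0,
        "Precision": tp / (tp + fp) if tp + fp > 0 else 1.0,
        "Predicted positive (%)": 100.0 * (fp + tp) / (p + n),
    }
```

With `tp=8, tn=80, fp=2, fn=10`, accuracy is 88/100, precision is 8/10, recall is 8/18 and F1 is 16/28. As in the function above, the accuracy and F1 expressions are unguarded, so calling either version with `tp`, `fp` and `fn` all zero raises `ZeroDivisionError`.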
--- a/miplearn/components/component.py
+++ b/miplearn/components/component.py
@@ -1,269 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Any, List, TYPE_CHECKING, Tuple, Dict, Optional
 import numpy as np
 from tqdm.auto import tqdm
 from p_tqdm import p_umap
 from miplearn.features.sample import Sample
 from miplearn.instance.base import Instance
 from miplearn.types import LearningSolveStats, Category
 if TYPE_CHECKING:
    from miplearn.solvers.learning import LearningSolver
 # noinspection PyMethodMayBeStatic
 class Component:
    """
    A Component is an object which adds functionality to a LearningSolver.
    For better code maintainability, LearningSolver simply delegates most of its
    functionality to Components. Each Component is responsible for exactly one ML
    strategy.
    """
    def after_solve_lp(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        """
        Method called by LearningSolver after the root LP relaxation is solved.
        See before_solve_lp for a description of the parameters.
        """
        return
    def after_solve_mip(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        """
        Method called by LearningSolver after the MIP is solved.
        See before_solve_lp for a description of the parameters.
        """
        return
    def before_solve_lp(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        """
        Method called by LearningSolver before the root LP relaxation is solved.
        Parameters
        ----------
        solver: LearningSolver
            The solver calling this method.
        instance: Instance
            The instance being solved.
        model
            The concrete optimization model being solved.
        stats: LearningSolveStats
            A dictionary containing statistics about the solution process, such as
            number of nodes explored and running time. Components are free to add
            their own statistics here. For example, PrimalSolutionComponent adds
            statistics regarding the number of predicted variables. All statistics in
            this dictionary are exported to the benchmark CSV file.
        sample: miplearn.features.Sample
            An object containing data that may be useful for training machine
            learning models and accelerating the solution process. Components are
            free to add their own training data here.
        """
        return
    def before_solve_mip(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        """
        Method called by LearningSolver before the MIP is solved.
        See before_solve_lp for a description of the parameters.
        """
        return
    def fit_xy(
        self,
        x: Dict[Category, np.ndarray],
        y: Dict[Category, np.ndarray],
    ) -> None:
        """
        Given two dictionaries x and y, mapping the name of the category to matrices
        of features and targets, this function does two things. First, for each
        category, it creates a clone of the prototype regressor/classifier. Second,
        it passes (x[category], y[category]) to the clone's fit method.
        """
        return
    def iteration_cb(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
    ) -> bool:
        """
        Method called by LearningSolver at the end of each iteration.
        After solving the MIP, LearningSolver calls `iteration_cb` of each component,
        giving them a chance to modify the problem and re-solve it before the solution
        process ends. For example, the lazy constraint component uses `iteration_cb`
        to check that all lazy constraints are satisfied.
        If `iteration_cb` returns False for all components, the solution process
        ends. If it returns True for any component, the MIP is solved again.
        Parameters
        ----------
        solver: LearningSolver
            The solver calling this method.
        instance: Instance
            The instance being solved.
        model: Any
            The concrete optimization model being solved.
        """
        return False
    def lazy_cb(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
    ) -> None:
        return
    def sample_evaluate(
        self,
        instance: Optional[Instance],
        sample: Sample,
    ) -> Dict[str, Dict[str, float]]:
        return {}
    def sample_xy(
        self,
        instance: Optional[Instance],
        sample: Sample,
    ) -> Tuple[Dict, Dict]:
        """
        Returns a pair of x and y dictionaries containing, respectively, the matrices
        of ML features and the labels for the sample. If the training sample does not
        include label information, returns (x, {}).
        """
        pass
    def pre_fit(self, pre: List[Any]) -> None:
        pass
    def user_cut_cb(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
    ) -> None:
        return
    def pre_sample_xy(self, instance: Instance, sample: Sample) -> Any:
        pass
    @staticmethod
    def fit_multiple(
        components: List["Component"],
        instances: List[Instance],
        n_jobs: int = 1,
        progress: bool = False,
    ) -> None:
        # Part I: Pre-fit
        def _pre_sample_xy(instance: Instance) -> Dict:
            pre_instance: Dict = {}
            for (cidx, comp) in enumerate(components):
                pre_instance[cidx] = []
            instance.load()
            for sample in instance.get_samples():
                for (cidx, comp) in enumerate(components):
                    pre_instance[cidx].append(comp.pre_sample_xy(instance, sample))
            instance.free()
            return pre_instance
        if n_jobs == 1:
            pre = [_pre_sample_xy(instance) for instance in instances]
        else:
            pre = p_umap(
                _pre_sample_xy,
                instances,
                num_cpus=n_jobs,
                desc="pre-sample-xy",
                disable=not progress,
            )
        pre_combined: Dict = {}
        for (cidx, comp) in enumerate(components):
            pre_combined[cidx] = []
            for p in pre:
                pre_combined[cidx].extend(p[cidx])
        for (cidx, comp) in enumerate(components):
            comp.pre_fit(pre_combined[cidx])
        # Part II: Fit
        def _sample_xy(instance: Instance) -> Tuple[Dict, Dict]:
            x_instance: Dict = {}
            y_instance: Dict = {}
            for (cidx, comp) in enumerate(components):
                x_instance[cidx] = {}
                y_instance[cidx] = {}
            instance.load()
            for sample in instance.get_samples():
                for (cidx, comp) in enumerate(components):
                    x = x_instance[cidx]
                    y = y_instance[cidx]
                    x_sample, y_sample = comp.sample_xy(instance, sample)
                    for cat in x_sample.keys():
                        if cat not in x:
                            x[cat] = []
                            y[cat] = []
                        x[cat] += x_sample[cat]
                        y[cat] += y_sample[cat]
            instance.free()
            return x_instance, y_instance
        if n_jobs == 1:
            xy = [_sample_xy(instance) for instance in instances]
        else:
            xy = p_umap(
                _sample_xy,
                instances,
                num_cpus=n_jobs,
                desc="sample-xy",
                disable=not progress,
            )
        for (cidx, comp) in enumerate(
            tqdm(
                components,
                desc="fit",
                disable=not progress,
            )
        ):
            x_comp: Dict = {}
            y_comp: Dict = {}
            for (x, y) in xy:
                for cat in x[cidx].keys():
                    if cat not in x_comp:
                        x_comp[cat] = []
                        y_comp[cat] = []
                    x_comp[cat].extend(x[cidx][cat])
                    y_comp[cat].extend(y[cidx][cat])
            for cat in x_comp.keys():
                x_comp[cat] = np.array(x_comp[cat], dtype=np.float32)
                y_comp[cat] = np.array(y_comp[cat])
            comp.fit_xy(x_comp, y_comp)
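
The `iteration_cb` docstring in `Component` describes a solve-and-check loop: the MIP is solved, every component's callback runs, and the MIP is solved again if any callback returns True. A minimal sketch of that contract is shown below. `solve_with_iterations` is hypothetical, the callbacks take no arguments for brevity, and `max_iterations` is a safety cap added for this illustration; the actual loop lives in `LearningSolver`, which is not shown in this diff.

```python
from typing import Callable, Sequence


def solve_with_iterations(
    solve: Callable[[], None],
    callbacks: Sequence[Callable[[], bool]],
    max_iterations: int = 100,  # hypothetical cap, not part of the source
) -> int:
    """Solve repeatedly until no callback requests another pass; return solve count."""
    n_solves = 0
    while True:
        solve()
        n_solves += 1
        # Call every callback (no short-circuit) so each one can modify the model.
        wants_resolve = [cb() for cb in callbacks]
        if not any(wants_resolve) or n_solves >= max_iterations:
            return n_solves
```

A lazy-constraint component fits this shape naturally: its callback adds the violated constraints it finds and returns True, and returns False once none remain.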
--- a/miplearn/components/dynamic_common.py
+++ b/miplearn/components/dynamic_common.py
@@ -1,184 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import json
 import logging
 from typing import Dict, List, Tuple, Optional, Any, Set
 import numpy as np
 from overrides import overrides
 from miplearn.features.extractor import FeaturesExtractor
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.threshold import Threshold
 from miplearn.components import classifier_evaluation_dict
 from miplearn.components.component import Component
 from miplearn.features.sample import Sample
 from miplearn.instance.base import Instance
 from miplearn.types import ConstraintCategory, ConstraintName
 logger = logging.getLogger(__name__)
 class DynamicConstraintsComponent(Component):
    """
    Base component used by both DynamicLazyConstraintsComponent and UserCutsComponent.
    """
    def __init__(
        self,
        attr: str,
        classifier: Classifier,
        threshold: Threshold,
    ):
        assert isinstance(classifier, Classifier)
        self.threshold_prototype: Threshold = threshold
        self.classifier_prototype: Classifier = classifier
        self.classifiers: Dict[ConstraintCategory, Classifier] = {}
        self.thresholds: Dict[ConstraintCategory, Threshold] = {}
        self.known_violations: Dict[ConstraintName, Any] = {}
        self.attr = attr
    def sample_xy_with_cids(
        self,
        instance: Optional[Instance],
        sample: Sample,
    ) -> Tuple[
        Dict[ConstraintCategory, List[List[float]]],
        Dict[ConstraintCategory, List[List[bool]]],
        Dict[ConstraintCategory, List[ConstraintName]],
    ]:
        if len(self.known_violations) == 0:
            return {}, {}, {}
        assert instance is not None
        x: Dict[ConstraintCategory, List[List[float]]] = {}
        y: Dict[ConstraintCategory, List[List[bool]]] = {}
        cids: Dict[ConstraintCategory, List[ConstraintName]] = {}
        known_cids = np.array(sorted(list(self.known_violations.keys())), dtype="S")
        enforced_cids = None
        enforced_encoded = sample.get_scalar(self.attr)
        if enforced_encoded is not None:
            enforced = self.decode(enforced_encoded)
            enforced_cids = list(enforced.keys())
        # Get user-provided constraint features
        (
            constr_features,
            constr_categories,
            constr_lazy,
        ) = FeaturesExtractor._extract_user_features_constrs(instance, known_cids)
        # Augment with instance features
        instance_features = sample.get_array("static_instance_features")
        assert instance_features is not None
        constr_features = np.hstack(
            [
                instance_features.reshape(1, -1).repeat(len(known_cids), axis=0),
                constr_features,
            ]
        )
        categories = np.unique(constr_categories)
        for c in categories:
            x[c] = constr_features[constr_categories == c].tolist()
            cids[c] = known_cids[constr_categories == c].tolist()
            if enforced_cids is not None:
                tmp = np.isin(cids[c], enforced_cids).reshape(-1, 1)
                y[c] = np.hstack([~tmp, tmp]).tolist()  # type: ignore
        return x, y, cids
    @overrides
    def sample_xy(
        self,
        instance: Optional[Instance],
        sample: Sample,
    ) -> Tuple[Dict, Dict]:
        x, y, _ = self.sample_xy_with_cids(instance, sample)
        return x, y
    @overrides
    def pre_fit(self, pre: List[Any]) -> None:
        assert pre is not None
        self.known_violations.clear()
        for violations in pre:
            for (vname, vdata) in violations.items():
                self.known_violations[vname] = vdata
    def sample_predict(
        self,
        instance: Instance,
        sample: Sample,
    ) -> List[ConstraintName]:
        pred: List[ConstraintName] = []
        if len(self.known_violations) == 0:
            logger.info("Classifiers not fitted. Skipping.")
            return pred
        x, _, cids = self.sample_xy_with_cids(instance, sample)
        for category in x.keys():
            assert category in self.classifiers
            assert category in self.thresholds
            clf = self.classifiers[category]
            thr = self.thresholds[category]
            nx = np.array(x[category])
            proba = clf.predict_proba(nx)
            t = thr.predict(nx)
            for i in range(proba.shape[0]):
                if proba[i][1] > t[1]:
                    pred += [cids[category][i]]
        return pred
    @overrides
    def pre_sample_xy(self, instance: Instance, sample: Sample) -> Any:
        attr_encoded = sample.get_scalar(self.attr)
        assert attr_encoded is not None
        return self.decode(attr_encoded)
    @overrides
    def fit_xy(
        self,
        x: Dict[ConstraintCategory, np.ndarray],
        y: Dict[ConstraintCategory, np.ndarray],
    ) -> None:
        for category in x.keys():
            self.classifiers[category] = self.classifier_prototype.clone()
            self.thresholds[category] = self.threshold_prototype.clone()
            npx = np.array(x[category])
            npy = np.array(y[category])
            self.classifiers[category].fit(npx, npy)
            self.thresholds[category].fit(self.classifiers[category], npx, npy)
    @overrides
    def sample_evaluate(
        self,
        instance: Instance,
        sample: Sample,
    ) -> Dict[str, float]:
        attr_encoded = sample.get_scalar(self.attr)
        assert attr_encoded is not None
        actual_violations = DynamicConstraintsComponent.decode(attr_encoded)
        actual = set(actual_violations.keys())
        pred = set(self.sample_predict(instance, sample))
        tp, tn, fp, fn = 0, 0, 0, 0
        for cid in self.known_violations.keys():
            if cid in pred:
                if cid in actual:
                    tp += 1
                else:
                    fp += 1
            else:
                if cid in actual:
                    fn += 1
                else:
                    tn += 1
        return classifier_evaluation_dict(tp=tp, tn=tn, fp=fp, fn=fn)
    @staticmethod
    def encode(violations: Dict[ConstraintName, Any]) -> str:
        return json.dumps({k.decode(): v for (k, v) in violations.items()})
    @staticmethod
    def decode(violations_encoded: str) -> Dict[ConstraintName, Any]:
        violations = json.loads(violations_encoded)
        return {k.encode(): v for (k, v) in violations.items()}
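
The `encode` and `decode` static methods above exist because JSON object keys must be strings, while constraint names are handled as bytes. The standalone sketch below reproduces the same round trip outside the class; the module-level function names are only for illustration.

```python
import json
from typing import Any, Dict


def encode(violations: Dict[bytes, Any]) -> str:
    """Serialize a bytes-keyed dict to JSON by decoding keys to str."""
    return json.dumps({k.decode(): v for k, v in violations.items()})


def decode(violations_encoded: str) -> Dict[bytes, Any]:
    """Inverse of encode: parse JSON and turn keys back into bytes."""
    return {k.encode(): v for k, v in json.loads(violations_encoded).items()}
```

The round trip preserves keys and JSON-native values, but values such as tuples come back as lists, so violation data should be limited to JSON-compatible types.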
--- a/miplearn/components/dynamic_lazy.py
+++ b/miplearn/components/dynamic_lazy.py
@@ -1,223 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import json
 import logging
 from typing import Dict, List, TYPE_CHECKING, Tuple, Any, Optional
 import numpy as np
 from overrides import overrides
 from tqdm.auto import tqdm
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.counting import CountingClassifier
 from miplearn.classifiers.threshold import MinProbabilityThreshold, Threshold
 from miplearn.components.component import Component
 from miplearn.components.dynamic_common import DynamicConstraintsComponent
 from miplearn.features.sample import Sample, Hdf5Sample
 from miplearn.instance.base import Instance
 from miplearn.types import LearningSolveStats, ConstraintName, ConstraintCategory
 from p_tqdm import p_map
 logger = logging.getLogger(__name__)
 if TYPE_CHECKING:
    from miplearn.solvers.learning import LearningSolver
 class DynamicLazyConstraintsComponent(Component):
    """
    A component that predicts which lazy constraints to enforce.
    """
    def __init__(
        self,
        classifier: Classifier = CountingClassifier(),
        threshold: Threshold = MinProbabilityThreshold([0, 0.05]),
    ):
        self.dynamic: DynamicConstraintsComponent = DynamicConstraintsComponent(
            classifier=classifier,
            threshold=threshold,
            attr="mip_constr_lazy",
        )
        self.classifiers = self.dynamic.classifiers
        self.thresholds = self.dynamic.thresholds
        self.known_violations = self.dynamic.known_violations
        self.lazy_enforced: Dict[ConstraintName, Any] = {}
        self.n_iterations: int = 0
    @staticmethod
    def enforce(
        violations: Dict[ConstraintName, Any],
        instance: Instance,
        model: Any,
        solver: "LearningSolver",
    ) -> None:
        assert solver.internal_solver is not None
        for (vname, vdata) in violations.items():
            instance.enforce_lazy_constraint(solver.internal_solver, model, vdata)
    @overrides
    def before_solve_mip(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        self.lazy_enforced.clear()
        logger.info("Predicting violated (dynamic) lazy constraints...")
        vnames = self.dynamic.sample_predict(instance, sample)
        violations = {c: self.dynamic.known_violations[c] for c in vnames}
        logger.info("Enforcing %d lazy constraints..." % len(vnames))
        self.enforce(violations, instance, model, solver)
        self.n_iterations = 0
    @overrides
    def after_solve_mip(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        sample.put_scalar("mip_constr_lazy", self.dynamic.encode(self.lazy_enforced))
        stats["LazyDynamic: Added in callback"] = len(self.lazy_enforced)
        stats["LazyDynamic: Iterations"] = self.n_iterations
    @overrides
    def iteration_cb(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
    ) -> bool:
        assert solver.internal_solver is not None
        logger.debug("Finding violated lazy constraints...")
        violations = instance.find_violated_lazy_constraints(
            solver.internal_solver,
            model,
        )
        if len(violations) == 0:
            logger.debug("No violations found")
            return False
        else:
            self.n_iterations += 1
            for v in violations:
                self.lazy_enforced[v] = violations[v]
            logger.debug("    %d violations found" % len(violations))
            self.enforce(violations, instance, model, solver)
            return True
    # Delegate ML methods to self.dynamic
    # -------------------------------------------------------------------
    @overrides
    def sample_xy(
        self,
        instance: Optional[Instance],
        sample: Sample,
    ) -> Tuple[Dict, Dict]:
        return self.dynamic.sample_xy(instance, sample)
    @overrides
    def pre_fit(self, pre: List[Any]) -> None:
        self.dynamic.pre_fit(pre)
    def sample_predict(
        self,
        instance: Instance,
        sample: Sample,
    ) -> List[ConstraintName]:
        return self.dynamic.sample_predict(instance, sample)
    @overrides
    def pre_sample_xy(self, instance: Instance, sample: Sample) -> Any:
        return self.dynamic.pre_sample_xy(instance, sample)
    @overrides
    def fit_xy(
        self,
        x: Dict[ConstraintCategory, np.ndarray],
        y: Dict[ConstraintCategory, np.ndarray],
    ) -> None:
        self.dynamic.fit_xy(x, y)
    @overrides
    def sample_evaluate(
        self,
        instance: Instance,
        sample: Sample,
    ) -> Dict[ConstraintCategory, Dict[str, float]]:
        return self.dynamic.sample_evaluate(instance, sample)
    # ------------------------------------------------------------------------------------------------------------------
    # NEW API
    # ------------------------------------------------------------------------------------------------------------------
    @staticmethod
    def extract(filenames, progress=True, known_cids=None):
        enforced_cids, features = [], []
        freeze_known_cids = True
        if known_cids is None:
            known_cids = set()
            freeze_known_cids = False
        for filename in tqdm(
            filenames,
            desc="extract (1/2)",
            disable=not progress,
        ):
            with Hdf5Sample(filename, mode="r") as sample:
                features.append(sample.get_array("lp_var_values"))
                cids = frozenset(
                    DynamicConstraintsComponent.decode(
                        sample.get_scalar("mip_constr_lazy")
                    ).keys()
                )
                enforced_cids.append(cids)
                if not freeze_known_cids:
                    known_cids.update(cids)
        x, y, cat, cdata = [], [], [], {}
        for (j, cid) in enumerate(known_cids):
            cdata[cid] = json.loads(cid.decode())
            for i in range(len(features)):
                cat.append(cid)
                x.append(features[i])
                if cid in enforced_cids[i]:
                    y.append([0, 1])
                else:
                    y.append([1, 0])
        x = np.vstack(x)
        y = np.vstack(y)
        cat = np.array(cat)
        x_dict, y_dict = DynamicLazyConstraintsComponent._split(
            x,
            y,
            cat,
            progress=progress,
        )
        return x_dict, y_dict, cdata
    @staticmethod
    def _split(x, y, cat, progress=False):
        # Sort data by categories
        pi = np.argsort(cat, kind="stable")
        x = x[pi]
        y = y[pi]
        cat = cat[pi]
        # Split categories
        x_dict = {}
        y_dict = {}
        start = 0
        for end in tqdm(
            range(len(cat) + 1),
            desc="extract (2/2)",
            disable=not progress,
        ):
            if (end >= len(cat)) or (cat[start] != cat[end]):
                x_dict[cat[start]] = x[start:end, :]
                y_dict[cat[start]] = y[start:end, :]
                start = end
        return x_dict, y_dict
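
`_split` groups rows by category by stably sorting on `cat` and then cutting the sorted arrays wherever the category changes. The sketch below shows the same idea on plain Python lists. `split_by_category` is a hypothetical name, and unlike the NumPy version above it returns empty dictionaries for empty input instead of indexing into an empty array.

```python
from typing import Any, Dict, List, Sequence, Tuple


def split_by_category(
    x: Sequence[Any], y: Sequence[Any], cat: Sequence[Any]
) -> Tuple[Dict[Any, List[Any]], Dict[Any, List[Any]]]:
    """Group rows of x and y by their category label, keeping input order per group."""
    # sorted() is stable, so rows within a category keep their relative order.
    order = sorted(range(len(cat)), key=lambda i: cat[i])
    x_dict: Dict[Any, List[Any]] = {}
    y_dict: Dict[Any, List[Any]] = {}
    start = 0
    for end in range(1, len(order) + 1):
        # Close the current group at the end of the data or at a category change.
        if end == len(order) or cat[order[start]] != cat[order[end]]:
            key = cat[order[start]]
            x_dict[key] = [x[i] for i in order[start:end]]
            y_dict[key] = [y[i] for i in order[start:end]]
            start = end
    return x_dict, y_dict
```

Sorting once and slicing contiguous runs avoids rescanning the full data set for every category, which matters when there are many categories.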
--- a/miplearn/components/dynamic_user_cuts.py
+++ b/miplearn/components/dynamic_user_cuts.py
@@ -1,133 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Any, TYPE_CHECKING, Tuple, Dict, List
 import numpy as np
 from overrides import overrides
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.counting import CountingClassifier
 from miplearn.classifiers.threshold import Threshold, MinProbabilityThreshold
 from miplearn.components.component import Component
 from miplearn.components.dynamic_common import DynamicConstraintsComponent
 from miplearn.features.sample import Sample
 from miplearn.instance.base import Instance
 from miplearn.types import LearningSolveStats, ConstraintName, ConstraintCategory
 logger = logging.getLogger(__name__)
 if TYPE_CHECKING:
    from miplearn.solvers.learning import LearningSolver
 class UserCutsComponent(Component):
    def __init__(
        self,
        classifier: Classifier = CountingClassifier(),
        threshold: Threshold = MinProbabilityThreshold([0.50, 0.50]),
    ) -> None:
        self.dynamic = DynamicConstraintsComponent(
            classifier=classifier,
            threshold=threshold,
            attr="mip_user_cuts",
        )
        self.enforced: Dict[ConstraintName, Any] = {}
        self.n_added_in_callback = 0
    @overrides
    def before_solve_mip(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        assert solver.internal_solver is not None
        self.enforced.clear()
        self.n_added_in_callback = 0
        logger.info("Predicting violated user cuts...")
        vnames = self.dynamic.sample_predict(instance, sample)
        logger.info("Enforcing %d user cuts ahead-of-time..." % len(vnames))
        for vname in vnames:
            vdata = self.dynamic.known_violations[vname]
            instance.enforce_user_cut(solver.internal_solver, model, vdata)
        stats["UserCuts: Added ahead-of-time"] = len(vnames)
    @overrides
    def user_cut_cb(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
    ) -> None:
        assert solver.internal_solver is not None
        logger.debug("Finding violated user cuts...")
        violations = instance.find_violated_user_cuts(model)
        logger.debug(f"Found {len(violations)} violated user cuts")
        logger.debug("Building violated user cuts...")
        for (vname, vdata) in violations.items():
            if vname in self.enforced:
                continue
            instance.enforce_user_cut(solver.internal_solver, model, vdata)
            self.enforced[vname] = vdata
            self.n_added_in_callback += 1
        if len(violations) > 0:
            logger.debug(f"Added {len(violations)} violated user cuts")
    @overrides
    def after_solve_mip(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        sample.put_scalar("mip_user_cuts", self.dynamic.encode(self.enforced))
        stats["UserCuts: Added in callback"] = self.n_added_in_callback
        if self.n_added_in_callback > 0:
            logger.info(f"{self.n_added_in_callback} user cuts added in callback")
    # Delegate ML methods to self.dynamic
    # -------------------------------------------------------------------
    @overrides
    def sample_xy(
        self,
        instance: "Instance",
        sample: Sample,
    ) -> Tuple[Dict, Dict]:
        return self.dynamic.sample_xy(instance, sample)
    @overrides
    def pre_fit(self, pre: List[Any]) -> None:
        self.dynamic.pre_fit(pre)
    def sample_predict(
        self,
        instance: "Instance",
        sample: Sample,
    ) -> List[ConstraintName]:
        return self.dynamic.sample_predict(instance, sample)
    @overrides
    def pre_sample_xy(self, instance: Instance, sample: Sample) -> Any:
        return self.dynamic.pre_sample_xy(instance, sample)
    @overrides
    def fit_xy(
        self,
        x: Dict[ConstraintCategory, np.ndarray],
        y: Dict[ConstraintCategory, np.ndarray],
    ) -> None:
        self.dynamic.fit_xy(x, y)
    @overrides
    def sample_evaluate(
        self,
        instance: "Instance",
        sample: Sample,
    ) -> Dict[ConstraintCategory, Dict[ConstraintName, float]]:
        return self.dynamic.sample_evaluate(instance, sample)
--- a/miplearn/components/lazy.py
+++ b/miplearn/components/lazy.py
@@ -0,0 +1,43 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import json
 from typing import Any, Dict, List
 import gurobipy as gp
 from ..h5 import H5File
 class ExpertLazyComponent:
    def __init__(self) -> None:
        pass
    def fit(self, train_h5: List[str]) -> None:
        pass
    def before_mip(self, test_h5: str, model: gp.Model, stats: Dict[str, Any]) -> None:
        with H5File(test_h5, "r") as h5:
            constr_names = h5.get_array("static_constr_names")
            constr_lazy = h5.get_array("mip_constr_lazy")
            constr_violations = h5.get_scalar("mip_constr_violations")
            assert constr_names is not None
            assert constr_violations is not None
            # Static lazy constraints
            n_static_lazy = 0
            if constr_lazy is not None:
                for (constr_idx, constr_name) in enumerate(constr_names):
                    if constr_lazy[constr_idx]:
                        constr = model.getConstrByName(constr_name.decode())
                        # Lazy=3: also pull in constraints that cut off the
                        # relaxation solution at the root node
                        constr.lazy = 3
                        n_static_lazy += 1
            stats.update({"Static lazy constraints": n_static_lazy})
            # Dynamic lazy constraints
            if hasattr(model, "_fix_violations"):
                violations = json.loads(constr_violations)
                model._fix_violations(model, violations, "aot")
                stats.update({"Dynamic lazy constraints": len(violations)})
--- a/miplearn/components/objective.py
+++ b/miplearn/components/objective.py
@@ -1,126 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import List, Dict, Any, TYPE_CHECKING, Tuple, Optional, cast
 import numpy as np
 from overrides import overrides
 from sklearn.linear_model import LinearRegression
 from miplearn.classifiers import Regressor
 from miplearn.classifiers.sklearn import ScikitLearnRegressor
 from miplearn.components.component import Component
 from miplearn.features.sample import Sample
 from miplearn.instance.base import Instance
 from miplearn.types import LearningSolveStats
 if TYPE_CHECKING:
    from miplearn.solvers.learning import LearningSolver
 logger = logging.getLogger(__name__)
 class ObjectiveValueComponent(Component):
    """
    A Component which predicts the optimal objective value of the problem.
    """
    def __init__(
        self,
        regressor: Regressor = ScikitLearnRegressor(LinearRegression()),
    ) -> None:
        assert isinstance(regressor, Regressor)
        self.regressors: Dict[str, Regressor] = {}
        self.regressor_prototype = regressor
    @overrides
    def before_solve_mip(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        logger.info("Predicting optimal value...")
        pred = self.sample_predict(sample)
        for (c, v) in pred.items():
            logger.info(f"Predicted {c.lower()}: {v:.6e}")
            stats[f"Objective: Predicted {c.lower()}"] = v  # type: ignore
    @overrides
    def fit_xy(
        self,
        x: Dict[str, np.ndarray],
        y: Dict[str, np.ndarray],
    ) -> None:
        for c in ["Upper bound", "Lower bound"]:
            if c in y:
                self.regressors[c] = self.regressor_prototype.clone()
                self.regressors[c].fit(x[c], y[c])
    def sample_predict(self, sample: Sample) -> Dict[str, float]:
        pred: Dict[str, float] = {}
        x, _ = self.sample_xy(None, sample)
        for c in ["Upper bound", "Lower bound"]:
            if c in self.regressors:
                pred[c] = self.regressors[c].predict(np.array(x[c]))[0, 0]
            else:
                logger.info(f"{c} regressor not fitted. Skipping.")
        return pred
    @overrides
    def sample_xy(
        self,
        _: Optional[Instance],
        sample: Sample,
    ) -> Tuple[Dict[str, List[List[float]]], Dict[str, List[List[float]]]]:
        lp_instance_features_np = sample.get_array("lp_instance_features")
        if lp_instance_features_np is None:
            lp_instance_features_np = sample.get_array("static_instance_features")
        assert lp_instance_features_np is not None
        lp_instance_features = cast(List[float], lp_instance_features_np.tolist())
        # Features
        x: Dict[str, List[List[float]]] = {
            "Upper bound": [lp_instance_features],
            "Lower bound": [lp_instance_features],
        }
        # Labels
        y: Dict[str, List[List[float]]] = {}
        mip_lower_bound = sample.get_scalar("mip_lower_bound")
        mip_upper_bound = sample.get_scalar("mip_upper_bound")
        if mip_lower_bound is not None:
            y["Lower bound"] = [[mip_lower_bound]]
        if mip_upper_bound is not None:
            y["Upper bound"] = [[mip_upper_bound]]
        return x, y
    @overrides
    def sample_evaluate(
        self,
        instance: Instance,
        sample: Sample,
    ) -> Dict[str, Dict[str, float]]:
        def compare(y_pred: float, y_actual: float) -> Dict[str, float]:
            err = np.round(abs(y_pred - y_actual), 8)
            return {
                "Actual value": y_actual,
                "Predicted value": y_pred,
                "Absolute error": err,
                "Relative error": err / y_actual,
            }
        result: Dict[str, Dict[str, float]] = {}
        pred = self.sample_predict(sample)
        actual_ub = sample.get_scalar("mip_upper_bound")
        actual_lb = sample.get_scalar("mip_lower_bound")
        if actual_ub is not None:
            result["Upper bound"] = compare(pred["Upper bound"], actual_ub)
        if actual_lb is not None:
            result["Lower bound"] = compare(pred["Lower bound"], actual_lb)
        return result
--- a/miplearn/components/primal.py
+++ b/miplearn/components/primal.py
@@ -1,341 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Dict, List, Any, TYPE_CHECKING, Tuple, Optional
 import numpy as np
 from overrides import overrides
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.adaptive import AdaptiveClassifier
 from miplearn.classifiers.threshold import MinPrecisionThreshold, Threshold
 from miplearn.components import classifier_evaluation_dict
 from miplearn.components.component import Component
 from miplearn.features.sample import Sample
 from miplearn.instance.base import Instance
 from miplearn.types import (
    LearningSolveStats,
    Category,
    Solution,
 )
 from miplearn.features.sample import Hdf5Sample
 from p_tqdm import p_map
 from tqdm.auto import tqdm
 logger = logging.getLogger(__name__)
 if TYPE_CHECKING:
    from miplearn.solvers.learning import LearningSolver
 class PrimalSolutionComponent(Component):
    """
    A component that predicts the optimal primal values for the binary decision
    variables.
    In exact mode, predicted primal solutions are provided to the solver as MIP
    starts. In heuristic mode, this component fixes the decision variables to their
    predicted values.
    """
    def __init__(
        self,
        classifier: Classifier = AdaptiveClassifier(),
        mode: str = "exact",
        threshold: Threshold = MinPrecisionThreshold([0.99, 0.99]),
    ) -> None:
        assert isinstance(classifier, Classifier)
        assert isinstance(threshold, Threshold)
        assert mode in ["exact", "heuristic"]
        self.mode = mode
        self.classifiers: Dict[Category, Classifier] = {}
        self.thresholds: Dict[Category, Threshold] = {}
        self.threshold_prototype = threshold
        self.classifier_prototype = classifier
    @overrides
    def before_solve_mip(
        self,
        solver: "LearningSolver",
        instance: Instance,
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        logger.info("Predicting primal solution...")
        # Do nothing if models are not trained
        if len(self.classifiers) == 0:
            logger.info("Classifiers not fitted. Skipping.")
            return
        # Predict solution and provide it to the solver
        solution = self.sample_predict(sample)
        assert solver.internal_solver is not None
        if self.mode == "heuristic":
            solver.internal_solver.fix(solution)
        else:
            solver.internal_solver.set_warm_start(solution)
        # Update statistics
        stats["Primal: Free"] = 0
        stats["Primal: Zero"] = 0
        stats["Primal: One"] = 0
        for (var_name, value) in solution.items():
            if value is None:
                stats["Primal: Free"] += 1
            else:
                if value < 0.5:
                    stats["Primal: Zero"] += 1
                else:
                    stats["Primal: One"] += 1
        logger.info(
            f"Predicted: free: {stats['Primal: Free']}, "
            f"zero: {stats['Primal: Zero']}, "
            f"one: {stats['Primal: One']}"
        )
    def sample_predict(self, sample: Sample) -> Solution:
        var_names = sample.get_array("static_var_names")
        var_categories = sample.get_array("static_var_categories")
        var_types = sample.get_array("static_var_types")
        assert var_names is not None
        assert var_categories is not None
        assert var_types is not None
        # Compute y_pred
        x, _ = self.sample_xy(None, sample)
        y_pred = {}
        for category in x.keys():
            assert category in self.classifiers, (
                f"Classifier for category {category} has not been trained. "
                f"Please call component.fit before component.predict."
            )
            xc = np.array(x[category])
            proba = self.classifiers[category].predict_proba(xc)
            thr = self.thresholds[category].predict(xc)
            y_pred[category] = np.vstack(
                [
                    proba[:, 0] >= thr[0],
                    proba[:, 1] >= thr[1],
                ]
            ).T
        # Convert y_pred into solution
        solution: Solution = {v: None for v in var_names}
        category_offset: Dict[Category, int] = {cat: 0 for cat in x.keys()}
        for (i, var_name) in enumerate(var_names):
            if var_types[i] != b"B":
                continue
            category = var_categories[i]
            if category not in category_offset:
                continue
            offset = category_offset[category]
            category_offset[category] += 1
            if y_pred[category][offset, 0]:
                solution[var_name] = 0.0
            if y_pred[category][offset, 1]:
                solution[var_name] = 1.0
        return solution
    @overrides
    def sample_xy(
        self,
        _: Optional[Instance],
        sample: Sample,
    ) -> Tuple[Dict[Category, List[List[float]]], Dict[Category, List[List[float]]]]:
        x: Dict = {}
        y: Dict = {}
        instance_features = sample.get_array("static_instance_features")
        mip_var_values = sample.get_array("mip_var_values")
        lp_var_values = sample.get_array("lp_var_values")
        var_features = sample.get_array("lp_var_features")
        var_names = sample.get_array("static_var_names")
        var_types = sample.get_array("static_var_types")
        var_categories = sample.get_array("static_var_categories")
        if var_features is None:
            var_features = sample.get_array("static_var_features")
        assert instance_features is not None
        assert var_features is not None
        assert var_names is not None
        assert var_types is not None
        assert var_categories is not None
        for (i, var_name) in enumerate(var_names):
            # Skip non-binary variables
            if var_types[i] != b"B":
                continue
            # Initialize categories
            category = var_categories[i]
            if len(category) == 0:
                continue
            if category not in x.keys():
                x[category] = []
                y[category] = []
            # Features
            features = list(instance_features)
            features.extend(var_features[i])
            if lp_var_values is not None:
                features.extend(lp_var_values)
            x[category].append(features)
            # Labels
            if mip_var_values is not None:
                opt_value = mip_var_values[i]
                assert opt_value is not None
                y[category].append([opt_value < 0.5, opt_value >= 0.5])
        return x, y
    @overrides
    def sample_evaluate(
        self,
        _: Optional[Instance],
        sample: Sample,
    ) -> Dict[str, Dict[str, float]]:
        mip_var_values = sample.get_array("mip_var_values")
        var_names = sample.get_array("static_var_names")
        assert mip_var_values is not None
        assert var_names is not None
        solution_actual = {
            var_name: mip_var_values[i] for (i, var_name) in enumerate(var_names)
        }
        solution_pred = self.sample_predict(sample)
        vars_all, vars_one, vars_zero = set(), set(), set()
        pred_one_positive, pred_zero_positive = set(), set()
        for (var_name, value_actual) in solution_actual.items():
            vars_all.add(var_name)
            if value_actual > 0.5:
                vars_one.add(var_name)
            else:
                vars_zero.add(var_name)
            value_pred = solution_pred[var_name]
            if value_pred is not None:
                if value_pred > 0.5:
                    pred_one_positive.add(var_name)
                else:
                    pred_zero_positive.add(var_name)
        pred_one_negative = vars_all - pred_one_positive
        pred_zero_negative = vars_all - pred_zero_positive
        return {
            "0": classifier_evaluation_dict(
                tp=len(pred_zero_positive & vars_zero),
                tn=len(pred_zero_negative & vars_one),
                fp=len(pred_zero_positive & vars_one),
                fn=len(pred_zero_negative & vars_zero),
            ),
            "1": classifier_evaluation_dict(
                tp=len(pred_one_positive & vars_one),
                tn=len(pred_one_negative & vars_zero),
                fp=len(pred_one_positive & vars_zero),
                fn=len(pred_one_negative & vars_one),
            ),
        }
    @overrides
    def fit_xy(
        self,
        x: Dict[Category, np.ndarray],
        y: Dict[Category, np.ndarray],
        progress: bool = False,
    ) -> None:
        for category in tqdm(x.keys(), desc="fit", disable=not progress):
            clf = self.classifier_prototype.clone()
            thr = self.threshold_prototype.clone()
            clf.fit(x[category], y[category])
            thr.fit(clf, x[category], y[category])
            self.classifiers[category] = clf
            self.thresholds[category] = thr
    # ------------------------------------------------------------------------------------------------------------------
    # NEW API
    # ------------------------------------------------------------------------------------------------------------------
    def fit(
        self,
        x: Dict[Category, np.ndarray],
        y: Dict[Category, np.ndarray],
        progress: bool = False,
    ) -> None:
        for category in tqdm(x.keys(), desc="fit", disable=not progress):
            clf = self.classifier_prototype.clone()
            thr = self.threshold_prototype.clone()
            clf.fit(x[category], y[category])
            thr.fit(clf, x[category], y[category])
            self.classifiers[category] = clf
            self.thresholds[category] = thr
    def predict(self, x):
        y_pred = {}
        for category in x.keys():
            assert category in self.classifiers, (
                f"Classifier for category {category} has not been trained. "
                f"Please call component.fit before component.predict."
            )
            xc = np.array(x[category])
            proba = self.classifiers[category].predict_proba(xc)
            thr = self.thresholds[category].predict(xc)
            y_pred[category] = np.vstack(
                [
                    proba[:, 0] >= thr[0],
                    proba[:, 1] >= thr[1],
                ]
            ).T
        return y_pred
    @staticmethod
    def extract(
        filenames: List[str],
        progress: bool = False,
    ):
        x, y, cat = [], [], []
        # Read data
        for filename in tqdm(
            filenames,
            desc="extract (1/2)",
            disable=not progress,
        ):
            with Hdf5Sample(filename, mode="r") as sample:
                mip_var_values = sample.get_array("mip_var_values")
                var_features = sample.get_array("lp_var_features")
                var_types = sample.get_array("static_var_types")
                var_categories = sample.get_array("static_var_categories")
                assert mip_var_values is not None
                assert var_features is not None
                assert var_types is not None
                assert var_categories is not None
                x.append(var_features)
                # Complementary labels: [is_zero, is_one], matching sample_xy
                y.append([mip_var_values < 0.5, mip_var_values >= 0.5])
                cat.extend(var_categories)
        # Convert to numpy arrays
        x = np.vstack(x)
        y = np.hstack(y).T
        cat = np.array(cat)
        # Sort data by categories
        pi = np.argsort(cat, kind="stable")
        x = x[pi]
        y = y[pi]
        cat = cat[pi]
        # Split categories
        x_dict = {}
        y_dict = {}
        start = 0
        for end in tqdm(
            range(len(cat) + 1),
            desc="extract (2/2)",
            disable=not progress,
        ):
            if (end >= len(cat)) or (cat[start] != cat[end]):
                x_dict[cat[start]] = x[start:end, :]
                y_dict[cat[start]] = y[start:end, :]
                start = end
        return x_dict, y_dict
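Review note on `extract` above: the final two steps (stable sort by category, then a single scan that cuts the sorted arrays wherever the category changes) can be tried in isolation. The sketch below is a standalone illustration, not the package's code; `split_by_category` and the toy arrays are hypothetical names introduced here. It starts the scan at 1, so an empty input simply returns empty dictionaries.

```python
import numpy as np


def split_by_category(x, y, cat):
    """Group the rows of x and y by the matching entry of cat."""
    # A stable sort keeps the original row order within each category.
    order = np.argsort(cat, kind="stable")
    x, y, cat = x[order], y[order], cat[order]
    x_dict, y_dict = {}, {}
    start = 0
    # Cut the sorted arrays at every position where the category changes.
    for end in range(1, len(cat) + 1):
        if end == len(cat) or cat[end] != cat[start]:
            key = cat[start].item()  # plain Python str/bytes as the key
            x_dict[key] = x[start:end]
            y_dict[key] = y[start:end]
            start = end
    return x_dict, y_dict


# Toy data (made up for illustration): four variables in two categories.
x = np.arange(8).reshape(4, 2)
y = np.array([[True, False], [False, True], [True, False], [False, True]])
cat = np.array(["b", "a", "b", "a"])
x_dict, y_dict = split_by_category(x, y, cat)
```

Each dictionary entry then holds only the rows of one category, which is the shape `fit` expects for `x[category]` and `y[category]`.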
--- a/miplearn/components/primal/init.py
+++ b/miplearn/components/primal/init.py
@@ -0,0 +1,29 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Tuple
 import numpy as np
 from miplearn.h5 import H5File
 def _extract_bin_var_names_values(
    h5: H5File,
 ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    bin_var_names, bin_var_indices = _extract_bin_var_names(h5)
    var_values = h5.get_array("mip_var_values")
    assert var_values is not None
    bin_var_values = var_values[bin_var_indices].astype(int)
    return bin_var_names, bin_var_values, bin_var_indices
 def _extract_bin_var_names(h5: H5File) -> Tuple[np.ndarray, np.ndarray]:
    var_types = h5.get_array("static_var_types")
    var_names = h5.get_array("static_var_names")
    assert var_types is not None
    assert var_names is not None
    bin_var_indices = np.where(var_types == b"B")[0]
    bin_var_names = var_names[bin_var_indices]
    assert len(bin_var_names.shape) == 1
    return bin_var_names, bin_var_indices
--- a/miplearn/components/primal/actions.py
+++ b/miplearn/components/primal/actions.py
@@ -0,0 +1,93 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from abc import ABC, abstractmethod
 from typing import Optional, Dict
 import numpy as np
 from miplearn.solvers.abstract import AbstractModel
 logger = logging.getLogger(__name__)
 class PrimalComponentAction(ABC):
    @abstractmethod
    def perform(
        self,
        model: AbstractModel,
        var_names: np.ndarray,
        var_values: np.ndarray,
        stats: Optional[Dict],
    ) -> None:
        pass
 class SetWarmStart(PrimalComponentAction):
    def perform(
        self,
        model: AbstractModel,
        var_names: np.ndarray,
        var_values: np.ndarray,
        stats: Optional[Dict],
    ) -> None:
        logger.info("Setting warm starts...")
        model.set_warm_starts(var_names, var_values, stats)
 class FixVariables(PrimalComponentAction):
    def perform(
        self,
        model: AbstractModel,
        var_names: np.ndarray,
        var_values: np.ndarray,
        stats: Optional[Dict],
    ) -> None:
        logger.info("Fixing variables...")
        assert len(var_values.shape) == 2
        assert var_values.shape[0] == 1
        var_values = var_values.reshape(-1)
        model.fix_variables(var_names, var_values, stats)
        if stats is not None:
            stats["Heuristic"] = True
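 class EnforceProximity(PrimalComponentAction):
    """
    Restricts the search to solutions close to the predicted one, by adding a
    constraint that lets at most a fraction `tol` of the predicted binary
    variables differ from their predicted values. Variables whose prediction
    is NaN are left unconstrained.
    """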
    def __init__(self, tol: float) -> None:
        self.tol = tol
    def perform(
        self,
        model: AbstractModel,
        var_names: np.ndarray,
        var_values: np.ndarray,
        stats: Optional[Dict],
    ) -> None:
        assert len(var_values.shape) == 2
        assert var_values.shape[0] == 1
        var_values = var_values.reshape(-1)
        constr_lhs = []
        constr_vars = []
        constr_rhs = 0.0
        for (i, var_name) in enumerate(var_names):
            if np.isnan(var_values[i]):
                continue
            constr_lhs.append(1.0 if var_values[i] < 0.5 else -1.0)
            constr_rhs -= var_values[i]
            constr_vars.append(var_name)
        constr_rhs += len(constr_vars) * self.tol
        logger.info(
            f"Adding proximity constraint (tol={self.tol}, nz={len(constr_vars)})..."
        )
        model.add_constrs(
            np.array(constr_vars),
            np.array([constr_lhs]),
            np.array(["<"], dtype="S"),
            np.array([constr_rhs]),
        )
        if stats is not None:
            stats["Heuristic"] = True
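Review note on `EnforceProximity.perform` above: the coefficients it builds encode a Hamming-distance bound. For predicted values v and n predicted variables, the constraint sum over {v_i = 0} of x_i minus sum over {v_i = 1} of x_i <= n * tol - sum of v_i is equivalent to saying that at most n * tol variables differ from the prediction. The sketch below reproduces only that arithmetic with NumPy; `proximity_constraint` and the sample prediction are hypothetical names used for illustration, and no solver is involved.

```python
import numpy as np


def proximity_constraint(var_values, tol):
    """Return (coefficients, rhs, indices) of the proximity constraint."""
    lhs, idx, rhs = [], [], 0.0
    for i, v in enumerate(var_values):
        if np.isnan(v):
            continue  # no prediction for this variable, leave it free
        # +1 for variables predicted at 0, -1 for variables predicted at 1
        lhs.append(1.0 if v < 0.5 else -1.0)
        rhs -= v
        idx.append(i)
    rhs += len(idx) * tol
    return np.array(lhs), float(rhs), idx


def is_allowed(lhs, rhs, idx, x):
    """Check whether candidate solution x satisfies lhs . x[idx] <= rhs."""
    return float(np.dot(lhs, np.asarray(x, dtype=float)[idx])) <= rhs


# Made-up prediction: three predicted variables, one without a prediction.
pred = np.array([1.0, 0.0, np.nan, 1.0])
lhs, rhs, idx = proximity_constraint(pred, tol=0.5)

same = is_allowed(lhs, rhs, idx, [1, 0, 0, 1])       # distance 0
one_flip = is_allowed(lhs, rhs, idx, [0, 0, 1, 1])   # distance 1
two_flips = is_allowed(lhs, rhs, idx, [0, 1, 0, 1])  # distance 2
```

With three predicted variables and `tol=0.5`, the bound is 1.5, so solutions up to one flip away are kept and the two-flip candidate is cut off.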
--- a/miplearn/components/primal/expert.py
+++ b/miplearn/components/primal/expert.py
@@ -0,0 +1,32 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Any, Dict, List
 from . import _extract_bin_var_names_values
 from .actions import PrimalComponentAction
 from ...solvers.abstract import AbstractModel
 from ...h5 import H5File
 logger = logging.getLogger(__name__)
 class ExpertPrimalComponent:
    """
    Component that predicts warm starts by peeking at the optimal solution.
    """
    def __init__(self, action: PrimalComponentAction):
        self.action = action
    def fit(self, train_h5: List[str]) -> None:
        pass
    def before_mip(
        self, test_h5: str, model: AbstractModel, stats: Dict[str, Any]
    ) -> None:
        with H5File(test_h5, "r") as h5:
            names, values, _ = _extract_bin_var_names_values(h5)
            self.action.perform(model, names, values.reshape(1, -1), stats)
--- a/miplearn/components/primal/indep.py
+++ b/miplearn/components/primal/indep.py
@@ -0,0 +1,129 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Any, Dict, List, Callable, Optional
 import numpy as np
 import sklearn
 from miplearn.components.primal import (
    _extract_bin_var_names_values,
    _extract_bin_var_names,
 )
 from miplearn.components.primal.actions import PrimalComponentAction
 from miplearn.extractors.abstract import FeaturesExtractor
 from miplearn.solvers.abstract import AbstractModel
 from miplearn.h5 import H5File
 logger = logging.getLogger(__name__)
 class IndependentVarsPrimalComponent:
    def __init__(
        self,
        base_clf: Any,
        extractor: FeaturesExtractor,
        action: PrimalComponentAction,
        clone_fn: Callable[[Any], Any] = sklearn.clone,
    ):
        self.base_clf = base_clf
        self.extractor = extractor
        self.clf_: Dict[bytes, Any] = {}
        self.bin_var_names_: Optional[np.ndarray] = None
        self.n_features_: Optional[int] = None
        self.clone_fn = clone_fn
        self.action = action
    def fit(self, train_h5: List[str]) -> None:
        logger.info("Reading training data...")
        self.bin_var_names_ = None
        n_bin_vars: Optional[int] = None
        n_vars: Optional[int] = None
        x, y = [], []
        for h5_filename in train_h5:
            with H5File(h5_filename, "r") as h5:
                # Get number of variables
                var_types = h5.get_array("static_var_types")
                assert var_types is not None
                n_vars = len(var_types)
                # Extract features
                (
                    bin_var_names,
                    bin_var_values,
                    bin_var_indices,
                ) = _extract_bin_var_names_values(h5)
                # Store/check variable names
                if self.bin_var_names_ is None:
                    self.bin_var_names_ = bin_var_names
                    n_bin_vars = len(self.bin_var_names_)
                else:
                    assert np.all(bin_var_names == self.bin_var_names_)
                # Build x and y vectors
                x_sample = self.extractor.get_var_features(h5)
                assert len(x_sample.shape) == 2
                assert x_sample.shape[0] == n_vars
                x_sample = x_sample[bin_var_indices]
                if self.n_features_ is None:
                    self.n_features_ = x_sample.shape[1]
                else:
                    assert x_sample.shape[1] == self.n_features_
                x.append(x_sample)
                y.append(bin_var_values)
        assert n_bin_vars is not None
        assert self.bin_var_names_ is not None
        logger.info("Constructing matrices...")
        x_np = np.vstack(x)
        y_np = np.hstack(y)
        n_samples = len(train_h5) * n_bin_vars
        assert x_np.shape == (n_samples, self.n_features_)
        assert y_np.shape == (n_samples,)
        logger.info(
            f"Dataset has {n_bin_vars} binary variables, "
            f"{len(train_h5):,d} samples per variable, "
            f"{self.n_features_:,d} features, 1 target and 2 classes"
        )
        logger.info(f"Training {n_bin_vars} classifiers...")
        self.clf_ = {}
        for (var_idx, var_name) in enumerate(self.bin_var_names_):
            self.clf_[var_name] = self.clone_fn(self.base_clf)
            self.clf_[var_name].fit(
                x_np[var_idx::n_bin_vars, :], y_np[var_idx::n_bin_vars]
            )
        logger.info("Done fitting.")
    def before_mip(
        self, test_h5: str, model: AbstractModel, stats: Dict[str, Any]
    ) -> None:
        assert self.bin_var_names_ is not None
        assert self.n_features_ is not None
        # Read features
        with H5File(test_h5, "r") as h5:
            x_sample = self.extractor.get_var_features(h5)
            bin_var_names, bin_var_indices = _extract_bin_var_names(h5)
            assert np.all(bin_var_names == self.bin_var_names_)
            x_sample = x_sample[bin_var_indices]
        assert x_sample.shape == (len(self.bin_var_names_), self.n_features_)
        # Predict optimal solution
        logger.info("Predicting warm starts...")
        y_pred = []
        for (var_idx, var_name) in enumerate(self.bin_var_names_):
            x_var = x_sample[var_idx, :].reshape(1, -1)
            y_var = self.clf_[var_name].predict(x_var)
            assert y_var.shape == (1,)
            y_pred.append(y_var[0])
        # Construct warm starts, based on prediction
        y_pred_np = np.array(y_pred).reshape(1, -1)
        assert y_pred_np.shape == (1, len(self.bin_var_names_))
        self.action.perform(model, self.bin_var_names_, y_pred_np, stats)
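Review note on `IndependentVarsPrimalComponent.fit` above: the slice `x_np[var_idx::n_bin_vars, :]` relies on every sample contributing exactly `n_bin_vars` rows in the same variable order, so that row `s * n_bin_vars + j` belongs to variable `j` of sample `s`. The sketch below checks that layout on made-up data; the sizes and the fill pattern are assumptions chosen only to make the rows easy to recognize.

```python
import numpy as np

n_samples, n_bin_vars, n_features = 3, 2, 4

# Hypothetical per-sample feature blocks: row j of sample s is filled
# with the value 100 * s + j, so its origin can be read off directly.
blocks = [
    np.full((n_bin_vars, n_features), 100 * s) + np.arange(n_bin_vars)[:, None]
    for s in range(n_samples)
]
x_np = np.vstack(blocks)  # shape (n_samples * n_bin_vars, n_features)

# Rows j, j + n_bin_vars, j + 2 * n_bin_vars, ... all belong to variable j.
per_var = {j: x_np[j::n_bin_vars, :] for j in range(n_bin_vars)}
```

Each `per_var[j]` has one row per training sample, which is the training matrix handed to the classifier of variable `j`.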
--- a/miplearn/components/primal/joint.py
+++ b/miplearn/components/primal/joint.py
@@ -0,0 +1,88 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import List, Dict, Any, Optional
 import numpy as np
 from miplearn.components.primal import _extract_bin_var_names_values
 from miplearn.components.primal.actions import PrimalComponentAction
 from miplearn.extractors.abstract import FeaturesExtractor
 from miplearn.solvers.abstract import AbstractModel
 from miplearn.h5 import H5File
 logger = logging.getLogger(__name__)
 class JointVarsPrimalComponent:
    def __init__(
        self, clf: Any, extractor: FeaturesExtractor, action: PrimalComponentAction
    ):
        self.clf = clf
        self.extractor = extractor
        self.bin_var_names_: Optional[np.ndarray] = None
        self.action = action
    def fit(self, train_h5: List[str]) -> None:
        logger.info("Reading training data...")
        self.bin_var_names_ = None
        x, y, n_samples, n_features = [], [], len(train_h5), None
        for h5_filename in train_h5:
            with H5File(h5_filename, "r") as h5:
                bin_var_names, bin_var_values, _ = _extract_bin_var_names_values(h5)
                # Store/check variable names
                if self.bin_var_names_ is None:
                    self.bin_var_names_ = bin_var_names
                else:
                    assert np.all(bin_var_names == self.bin_var_names_)
                # Build x and y vectors
                x_sample = self.extractor.get_instance_features(h5)
                assert len(x_sample.shape) == 1
                if n_features is None:
                    n_features = len(x_sample)
                else:
                    assert len(x_sample) == n_features
                x.append(x_sample)
                y.append(bin_var_values)
        assert self.bin_var_names_ is not None
        logger.info("Constructing matrices...")
        x_np = np.vstack(x)
        y_np = np.array(y)
        assert len(x_np.shape) == 2
        assert x_np.shape[0] == n_samples
        assert x_np.shape[1] == n_features
        assert y_np.shape == (n_samples, len(self.bin_var_names_))
        logger.info(
            f"Dataset has {n_samples:,d} samples, "
            f"{n_features:,d} features and {y_np.shape[1]:,d} targets"
        )
        logger.info("Training classifier...")
        self.clf.fit(x_np, y_np)
        logger.info("Done fitting.")
    def before_mip(
        self, test_h5: str, model: AbstractModel, stats: Dict[str, Any]
    ) -> None:
        assert self.bin_var_names_ is not None
        # Read features
        with H5File(test_h5, "r") as h5:
            x_sample = self.extractor.get_instance_features(h5)
        assert len(x_sample.shape) == 1
        x_sample = x_sample.reshape(1, -1)
        # Predict optimal solution
        logger.info("Predicting warm starts...")
        y_pred = self.clf.predict(x_sample)
        assert len(y_pred.shape) == 2
        assert y_pred.shape[0] == 1
        assert y_pred.shape[1] == len(self.bin_var_names_)
        # Construct warm starts, based on prediction
        self.action.perform(model, self.bin_var_names_, y_pred, stats)
--- a/miplearn/components/primal/mem.py
+++ b/miplearn/components/primal/mem.py
@@ -0,0 +1,167 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from abc import ABC, abstractmethod
 from typing import List, Dict, Any, Optional, Tuple
 import numpy as np
 from . import _extract_bin_var_names_values
 from .actions import PrimalComponentAction
 from ...extractors.abstract import FeaturesExtractor
 from ...solvers.abstract import AbstractModel
 from ...h5 import H5File
 logger = logging.getLogger(__name__)
 class SolutionConstructor(ABC):
    @abstractmethod
    def construct(self, y_proba: np.ndarray, solutions: np.ndarray) -> np.ndarray:
        pass
 class MemorizingPrimalComponent:
    """
    Component that memorizes all solutions seen during training, then fits a
    single classifier to predict which of the memorized solutions should be
    provided to the solver. Optionally combines multiple memorized solutions
    into a single, partial one.
    """
    def __init__(
        self,
        clf: Any,
        extractor: FeaturesExtractor,
        constructor: SolutionConstructor,
        action: PrimalComponentAction,
    ) -> None:
        assert clf is not None
        self.clf = clf
        self.extractor = extractor
        self.constructor = constructor
        self.solutions_: Optional[np.ndarray] = None
        self.bin_var_names_: Optional[np.ndarray] = None
        self.action = action
    def fit(self, train_h5: List[str]) -> None:
        logger.info("Reading training data...")
        n_samples = len(train_h5)
        solutions_ = []
        self.bin_var_names_ = None
        x, y, n_features = [], [], None
        solution_to_idx: Dict[Tuple, int] = {}
        for h5_filename in train_h5:
            with H5File(h5_filename, "r") as h5:
                bin_var_names, bin_var_values, _ = _extract_bin_var_names_values(h5)
                # Store/check variable names
                if self.bin_var_names_ is None:
                    self.bin_var_names_ = bin_var_names
                else:
                    assert np.all(bin_var_names == self.bin_var_names_)
                # Store solution
                sol = tuple(np.where(bin_var_values)[0])
                if sol not in solution_to_idx:
                    solutions_.append(bin_var_values)
                    solution_to_idx[sol] = len(solution_to_idx)
                y.append(solution_to_idx[sol])
                # Extract features
                x_sample = self.extractor.get_instance_features(h5)
                assert len(x_sample.shape) == 1
                if n_features is None:
                    n_features = len(x_sample)
                else:
                    assert len(x_sample) == n_features
                x.append(x_sample)
        logger.info("Constructing matrices...")
        x_np = np.vstack(x)
        y_np = np.array(y)
        assert len(x_np.shape) == 2
        assert x_np.shape[0] == n_samples
        assert x_np.shape[1] == n_features
        assert y_np.shape == (n_samples,)
        self.solutions_ = np.array(solutions_)
        n_classes = len(solution_to_idx)
        logger.info(
            f"Dataset has {n_samples:,d} samples, "
            f"{n_features:,d} features and {n_classes:,d} classes"
        )
        logger.info("Training classifier...")
        self.clf.fit(x_np, y_np)
        logger.info("Done fitting.")
    def before_mip(
        self, test_h5: str, model: AbstractModel, stats: Dict[str, Any]
    ) -> None:
        assert self.solutions_ is not None
        assert self.bin_var_names_ is not None
        # Read features
        with H5File(test_h5, "r") as h5:
            x_sample = self.extractor.get_instance_features(h5)
        assert len(x_sample.shape) == 1
        x_sample = x_sample.reshape(1, -1)
        # Predict optimal solution
        logger.info("Predicting primal solution...")
        y_proba = self.clf.predict_proba(x_sample)
        assert len(y_proba.shape) == 2
        assert y_proba.shape[0] == 1
        assert y_proba.shape[1] == len(self.solutions_)
        # Construct warm starts, based on prediction
        starts = self.constructor.construct(y_proba[0, :], self.solutions_)
        self.action.perform(model, self.bin_var_names_, starts, stats)
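
The `fit` method above turns each distinct training solution into a class label. A small sketch of that encoding step in isolation, with hypothetical binary vectors standing in for the values read from the HDF5 files:

```python
import numpy as np

# Hypothetical optimal values of three binary variables in three training instances
train_values = [
    np.array([1, 0, 1]),
    np.array([0, 1, 1]),
    np.array([1, 0, 1]),  # same solution as the first instance
]

solution_to_idx = {}
solutions, y = [], []
for values in train_values:
    # The indices of the nonzero entries serve as a hashable key
    key = tuple(np.where(values)[0])
    if key not in solution_to_idx:
        solutions.append(values)
        solution_to_idx[key] = len(solution_to_idx)
    y.append(solution_to_idx[key])

solutions_np = np.array(solutions)
```

Repeated solutions share a label, so the number of classes equals the number of distinct solutions seen during training.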
 class SelectTopSolutions(SolutionConstructor):
    """
    Warm start construction strategy that selects and returns the k memorized
    solutions with the highest predicted probability.
    """
    def __init__(self, k: int) -> None:
        self.k = k
    def construct(self, y_proba: np.ndarray, solutions: np.ndarray) -> np.ndarray:
        # Check arguments
        assert len(y_proba.shape) == 1
        assert len(solutions.shape) == 2
        assert len(y_proba) == solutions.shape[0]
        # Select top k solutions
        ind = np.argsort(-y_proba, kind="stable")
        selected = ind[: min(self.k, len(ind))]
        return solutions[selected, :]
 class MergeTopSolutions(SolutionConstructor):
    """
    Warm start construction strategy that first selects the top k solutions,
    then merges them into a single solution.
    To merge the solutions, the strategy first computes the mean value of each
    decision variable across the selected solutions, then: (i) sets the variable
    to zero if the mean is at or below thresholds[0]; (ii) sets the variable to one
    if the mean is at or above thresholds[1]; (iii) leaves the variable free otherwise.
    """
    def __init__(self, k: int, thresholds: List[float]):
        assert len(thresholds) == 2
        self.k = k
        self.thresholds = thresholds
    def construct(self, y_proba: np.ndarray, solutions: np.ndarray) -> np.ndarray:
        filtered = SelectTopSolutions(self.k).construct(y_proba, solutions)
        mean = filtered.mean(axis=0)
        start = np.full((1, solutions.shape[1]), float("nan"))
        start[0, mean <= self.thresholds[0]] = 0
        start[0, mean >= self.thresholds[1]] = 1
        return start
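
To illustrate the two construction strategies above, here is a standalone sketch that mirrors their logic as plain functions. The function names `select_top` and `merge_top`, the toy probabilities and the thresholds are assumptions made for this example.

```python
import numpy as np

def select_top(y_proba, solutions, k):
    # Sort by descending probability; "stable" keeps ties in their original order
    ind = np.argsort(-y_proba, kind="stable")
    return solutions[ind[: min(k, len(ind))], :]

def merge_top(y_proba, solutions, k, thresholds):
    # Average the top-k solutions, then fix only the variables they agree on
    mean = select_top(y_proba, solutions, k).mean(axis=0)
    start = np.full((1, solutions.shape[1]), float("nan"))
    start[0, mean <= thresholds[0]] = 0
    start[0, mean >= thresholds[1]] = 1
    return start

solutions = np.array([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
y_proba = np.array([0.5, 0.3, 0.2])

top2 = select_top(y_proba, solutions, k=2)
merged = merge_top(y_proba, solutions, k=2, thresholds=[0.25, 0.75])
```

Here the two most likely solutions agree only on the first variable, so the merged start fixes it to one and leaves the other two as NaN, meaning no value is proposed for them.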
--- a/miplearn/components/priority.py
+++ b/miplearn/components/priority.py
@@ -0,0 +1,31 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from math import log
 from typing import List, Dict, Any
 import numpy as np
 import gurobipy as gp
 from ..h5 import H5File
 class ExpertBranchPriorityComponent:
    def __init__(self) -> None:
        pass
    def fit(self, train_h5: List[str]) -> None:
        pass
    def before_mip(self, test_h5: str, model: gp.Model, _: Dict[str, Any]) -> None:
        with H5File(test_h5, "r") as h5:
            var_names = h5.get_array("static_var_names")
            var_priority = h5.get_array("bb_var_priority")
            assert var_priority is not None
            assert var_names is not None
            for (var_idx, var_name) in enumerate(var_names):
                if np.isfinite(var_priority[var_idx]):
                    var = model.getVarByName(var_name.decode())
                    var.branchPriority = int(log(1 + var_priority[var_idx]))
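
The priority assignment above compresses the stored values logarithmically before truncating them to integers. A rough sketch of that mapping on made-up inputs, using `math.isfinite` in place of `np.isfinite` and a hypothetical helper name:

```python
from math import isfinite, log

def to_branch_priority(raw):
    # Same transformation as in before_mip: int(log(1 + raw))
    return int(log(1 + raw))

# Made-up raw priorities; non-finite entries are skipped, as in the component
raw_priorities = [0.0, 1.0, 10.0, 1000.0, float("nan")]
priorities = [to_branch_priority(p) for p in raw_priorities if isfinite(p)]
```

Values spanning three orders of magnitude end up in a narrow integer range, and small raw values collapse to priority zero.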
--- a/miplearn/components/static_lazy.py
+++ b/miplearn/components/static_lazy.py
@@ -1,252 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from typing import Dict, Tuple, List, Any, TYPE_CHECKING, Set, Optional
 import numpy as np
 from overrides import overrides
 from miplearn.classifiers import Classifier
 from miplearn.classifiers.counting import CountingClassifier
 from miplearn.classifiers.threshold import MinProbabilityThreshold, Threshold
 from miplearn.components.component import Component
 from miplearn.features.sample import Sample
 from miplearn.solvers.internal import Constraints
 from miplearn.instance.base import Instance
 from miplearn.types import LearningSolveStats, ConstraintName, ConstraintCategory
 logger = logging.getLogger(__name__)
 if TYPE_CHECKING:
    from miplearn.solvers.learning import LearningSolver
 class LazyConstraint:
    def __init__(self, cid: ConstraintName, obj: Any) -> None:
        self.cid = cid
        self.obj = obj
 class StaticLazyConstraintsComponent(Component):
    """
    Component that decides which of the constraints tagged as lazy should
    be kept in the formulation, and which should be removed.
    """
    def __init__(
        self,
        classifier: Classifier = CountingClassifier(),
        threshold: Threshold = MinProbabilityThreshold([0.50, 0.50]),
        violation_tolerance: float = -0.5,
    ) -> None:
        assert isinstance(classifier, Classifier)
        self.classifier_prototype: Classifier = classifier
        self.threshold_prototype: Threshold = threshold
        self.classifiers: Dict[ConstraintCategory, Classifier] = {}
        self.thresholds: Dict[ConstraintCategory, Threshold] = {}
        self.pool: Constraints = Constraints()
        self.violation_tolerance: float = violation_tolerance
        self.enforced_cids: Set[ConstraintName] = set()
        self.n_restored: int = 0
        self.n_iterations: int = 0
    @overrides
    def after_solve_mip(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        sample.put_array(
            "mip_constr_lazy_enforced",
            np.array(list(self.enforced_cids), dtype="S"),
        )
        stats["LazyStatic: Restored"] = self.n_restored
        stats["LazyStatic: Iterations"] = self.n_iterations
    @overrides
    def before_solve_mip(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
        stats: LearningSolveStats,
        sample: Sample,
    ) -> None:
        assert solver.internal_solver is not None
        static_lazy_count = sample.get_scalar("static_constr_lazy_count")
        assert static_lazy_count is not None
        logger.info("Predicting violated (static) lazy constraints...")
        if static_lazy_count == 0:
            logger.info("Instance does not have static lazy constraints. Skipping.")
        self.enforced_cids = set(self.sample_predict(sample))
        logger.info("Moving lazy constraints to the pool...")
        constraints = Constraints.from_sample(sample)
        assert constraints.lazy is not None
        assert constraints.names is not None
        selected = [
            (constraints.lazy[i] and constraints.names[i] not in self.enforced_cids)
            for i in range(len(constraints.lazy))
        ]
        n_removed = sum(selected)
        n_kept = sum(constraints.lazy) - n_removed
        self.pool = constraints[selected]
        assert self.pool.names is not None
        solver.internal_solver.remove_constraints(self.pool.names)
        logger.info(f"{n_kept} lazy constraints kept; {n_removed} moved to the pool")
        stats["LazyStatic: Removed"] = n_removed
        stats["LazyStatic: Kept"] = n_kept
        stats["LazyStatic: Restored"] = 0
        self.n_restored = 0
        self.n_iterations = 0
    @overrides
    def fit_xy(
        self,
        x: Dict[ConstraintCategory, np.ndarray],
        y: Dict[ConstraintCategory, np.ndarray],
    ) -> None:
        for c in y.keys():
            assert c in x
            self.classifiers[c] = self.classifier_prototype.clone()
            self.thresholds[c] = self.threshold_prototype.clone()
            self.classifiers[c].fit(x[c], y[c])
            self.thresholds[c].fit(self.classifiers[c], x[c], y[c])
    @overrides
    def iteration_cb(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
    ) -> bool:
        if solver.use_lazy_cb:
            return False
        else:
            return self._check_and_add(solver)
    @overrides
    def lazy_cb(
        self,
        solver: "LearningSolver",
        instance: "Instance",
        model: Any,
    ) -> None:
        self._check_and_add(solver)
    def sample_predict(self, sample: Sample) -> List[ConstraintName]:
        x, y, cids = self._sample_xy_with_cids(sample)
        enforced_cids: List[ConstraintName] = []
        for category in x.keys():
            if category not in self.classifiers:
                continue
            npx = np.array(x[category])
            proba = self.classifiers[category].predict_proba(npx)
            thr = self.thresholds[category].predict(npx)
            pred = list(proba[:, 1] > thr[1])
            for (i, is_selected) in enumerate(pred):
                if is_selected:
                    enforced_cids += [cids[category][i]]
        return enforced_cids
    @overrides
    def sample_xy(
        self,
        _: Optional[Instance],
        sample: Sample,
    ) -> Tuple[
        Dict[ConstraintCategory, List[List[float]]],
        Dict[ConstraintCategory, List[List[float]]],
    ]:
        x, y, __ = self._sample_xy_with_cids(sample)
        return x, y
    def _check_and_add(self, solver: "LearningSolver") -> bool:
        assert solver.internal_solver is not None
        assert self.pool.names is not None
        if len(self.pool.names) == 0:
            logger.info("Lazy constraint pool is empty. Skipping violation check.")
            return False
        self.n_iterations += 1
        logger.info("Finding violated lazy constraints...")
        is_satisfied = solver.internal_solver.are_constraints_satisfied(
            self.pool,
            tol=self.violation_tolerance,
        )
        is_violated = [not i for i in is_satisfied]
        violated_constraints = self.pool[is_violated]
        satisfied_constraints = self.pool[is_satisfied]
        self.pool = satisfied_constraints
        assert violated_constraints.names is not None
        assert satisfied_constraints.names is not None
        n_violated = len(violated_constraints.names)
        n_satisfied = len(satisfied_constraints.names)
        logger.info(f"Found {n_violated} violated lazy constraints")
        if n_violated > 0:
            logger.info(
                f"Enforcing {n_violated} lazy constraints; "
                f"{n_satisfied} left in the pool..."
            )
            solver.internal_solver.add_constraints(violated_constraints)
            for (i, name) in enumerate(violated_constraints.names):
                self.enforced_cids.add(name)
                self.n_restored += 1
            return True
        else:
            return False
    def _sample_xy_with_cids(
        self, sample: Sample
    ) -> Tuple[
        Dict[ConstraintCategory, List[List[float]]],
        Dict[ConstraintCategory, List[List[float]]],
        Dict[ConstraintCategory, List[ConstraintName]],
    ]:
        x: Dict[ConstraintCategory, List[List[float]]] = {}
        y: Dict[ConstraintCategory, List[List[float]]] = {}
        cids: Dict[ConstraintCategory, List[ConstraintName]] = {}
        instance_features = sample.get_array("static_instance_features")
        constr_features = sample.get_array("lp_constr_features")
        constr_names = sample.get_array("static_constr_names")
        constr_categories = sample.get_array("static_constr_categories")
        constr_lazy = sample.get_array("static_constr_lazy")
        lazy_enforced = sample.get_array("mip_constr_lazy_enforced")
        if constr_features is None:
            constr_features = sample.get_array("static_constr_features")
        assert instance_features is not None
        assert constr_features is not None
        assert constr_names is not None
        assert constr_categories is not None
        assert constr_lazy is not None
        for (cidx, cname) in enumerate(constr_names):
            # Initialize categories
            if not constr_lazy[cidx]:
                continue
            category = constr_categories[cidx]
            if len(category) == 0:
                continue
            if category not in x:
                x[category] = []
                y[category] = []
                cids[category] = []
            # Features
            features = list(instance_features)
            features.extend(constr_features[cidx])
            x[category].append(features)
            cids[category].append(cname)
            # Labels
            if lazy_enforced is not None:
                if cname in lazy_enforced:
                    y[category] += [[False, True]]
                else:
                    y[category] += [[True, False]]
        return x, y, cids
--- a/miplearn/extractors/AlvLouWeh2017.py
+++ b/miplearn/extractors/AlvLouWeh2017.py
@@ -0,0 +1,210 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Tuple, Optional
 import numpy as np
 from miplearn.extractors.abstract import FeaturesExtractor
 from miplearn.h5 import H5File
 class AlvLouWeh2017Extractor(FeaturesExtractor):
    def __init__(
        self,
        with_m1: bool = True,
        with_m2: bool = True,
        with_m3: bool = True,
    ):
        self.with_m1 = with_m1
        self.with_m2 = with_m2
        self.with_m3 = with_m3
    def get_instance_features(self, h5: H5File) -> np.ndarray:
        raise NotImplementedError()
    def get_var_features(self, h5: H5File) -> np.ndarray:
        """
        Computes static variable features described in:
            Alvarez, A. M., Louveaux, Q., & Wehenkel, L. (2017). A machine learning-based
            approximation of strong branching. INFORMS Journal on Computing, 29(1),
            185-195.
        """
        A = h5.get_sparse("static_constr_lhs")
        b = h5.get_array("static_constr_rhs")
        c = h5.get_array("static_var_obj_coeffs")
        c_sa_up = h5.get_array("lp_var_sa_obj_up")
        c_sa_down = h5.get_array("lp_var_sa_obj_down")
        values = h5.get_array("lp_var_values")
        assert A is not None
        assert b is not None
        assert c is not None
        nvars = len(c)
        curr = 0
        max_n_features = 40
        features = np.zeros((nvars, max_n_features))
        def push(v: np.ndarray) -> None:
            nonlocal curr
            assert v.shape == (nvars,), f"{v.shape} != ({nvars},)"
            features[:, curr] = v
            curr += 1
        def push_sign_abs(v: np.ndarray) -> None:
            assert v.shape == (nvars,), f"{v.shape} != ({nvars},)"
            push(np.sign(v))
            push(np.abs(v))
        def maxmin(M: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
            M_max = np.ravel(M.max(axis=0).todense())
            M_min = np.ravel(M.min(axis=0).todense())
            return M_max, M_min
        with np.errstate(divide="ignore", invalid="ignore"):
            # Feature 1
            push(np.sign(c))
            # Feature 2
            c_pos_sum = c[c > 0].sum()
            push(np.abs(c) / c_pos_sum)
            # Feature 3
            c_neg_sum = -c[c < 0].sum()
            push(np.abs(c) / c_neg_sum)
            if A is not None and self.with_m1:
                # Compute A_ji / |b_j|
                M1 = A.T.multiply(1.0 / np.abs(b)).T.tocsr()
                # Select rows with positive b_j and compute max/min
                M1_pos = M1[b > 0, :]
                if M1_pos.shape[0] > 0:
                    M1_pos_max = np.asarray(M1_pos.max(axis=0).todense()).flatten()
                    M1_pos_min = np.asarray(M1_pos.min(axis=0).todense()).flatten()
                else:
                    M1_pos_max = np.zeros(nvars)
                    M1_pos_min = np.zeros(nvars)
                # Select rows with negative b_j and compute max/min
                M1_neg = M1[b < 0, :]
                if M1_neg.shape[0] > 0:
                    M1_neg_max = np.asarray(M1_neg.max(axis=0).todense()).flatten()
                    M1_neg_min = np.asarray(M1_neg.min(axis=0).todense()).flatten()
                else:
                    M1_neg_max = np.zeros(nvars)
                    M1_neg_min = np.zeros(nvars)
                # Features 4-11
                push_sign_abs(M1_pos_min)
                push_sign_abs(M1_pos_max)
                push_sign_abs(M1_neg_min)
                push_sign_abs(M1_neg_max)
            if A is not None and self.with_m2:
                # Compute |c_i| / A_ij
                M2 = A.power(-1).multiply(np.abs(c)).tocsc()
                # Compute max/min
                M2_max, M2_min = maxmin(M2)
                # Make copies of M2 and erase elements based on sign(c)
                M2_pos_max = M2_max.copy()
                M2_neg_max = M2_max.copy()
                M2_pos_min = M2_min.copy()
                M2_neg_min = M2_min.copy()
                M2_pos_max[c <= 0] = 0
                M2_pos_min[c <= 0] = 0
                M2_neg_max[c >= 0] = 0
                M2_neg_min[c >= 0] = 0
                # Features 12-19
                push_sign_abs(M2_pos_min)
                push_sign_abs(M2_pos_max)
                push_sign_abs(M2_neg_min)
                push_sign_abs(M2_neg_max)
            if A is not None and self.with_m3:
                # Compute row sums
                S_pos = A.maximum(0).sum(axis=1)
                S_neg = np.abs(A.minimum(0).sum(axis=1))
                # Divide A by positive and negative row sums
                M3_pos = A.multiply(1 / S_pos).tocsr()
                M3_neg = A.multiply(1 / S_neg).tocsr()
                # Remove +inf and -inf generated by division by zero
                M3_pos.data[~np.isfinite(M3_pos.data)] = 0.0
                M3_neg.data[~np.isfinite(M3_neg.data)] = 0.0
                M3_pos.eliminate_zeros()
                M3_neg.eliminate_zeros()
                # Split each matrix into positive and negative parts
                M3_pos_pos = M3_pos.maximum(0)
                M3_pos_neg = -(M3_pos.minimum(0))
                M3_neg_pos = M3_neg.maximum(0)
                M3_neg_neg = -(M3_neg.minimum(0))
                # Calculate max/min
                M3_pos_pos_max, M3_pos_pos_min = maxmin(M3_pos_pos)
                M3_pos_neg_max, M3_pos_neg_min = maxmin(M3_pos_neg)
                M3_neg_pos_max, M3_neg_pos_min = maxmin(M3_neg_pos)
                M3_neg_neg_max, M3_neg_neg_min = maxmin(M3_neg_neg)
                # Features 20-35
                push_sign_abs(M3_pos_pos_max)
                push_sign_abs(M3_pos_pos_min)
                push_sign_abs(M3_pos_neg_max)
                push_sign_abs(M3_pos_neg_min)
                push_sign_abs(M3_neg_pos_max)
                push_sign_abs(M3_neg_pos_min)
                push_sign_abs(M3_neg_neg_max)
                push_sign_abs(M3_neg_neg_min)
            # Feature 36: only available during B&B
            # Feature 37
            if values is not None:
                push(
                    np.minimum(
                        values - np.floor(values),
                        np.ceil(values) - values,
                    )
                )
            # Features 38-43: only available during B&B
            # Feature 44
            if c_sa_up is not None:
                assert c_sa_down is not None
                # Features 44 and 46
                push(np.sign(c_sa_up))
                push(np.sign(c_sa_down))
                # Feature 45 is duplicated
                # Features 47-48
                push(np.log(c - c_sa_down / np.sign(c)))
                push(np.log(c - c_sa_up / np.sign(c)))
                # Features 49-64: only available during B&B
        features = features[:, 0:curr]
        _fix_infinity(features)
        return features
    def get_constr_features(self, h5: H5File) -> np.ndarray:
        raise NotImplementedError()
 def _fix_infinity(m: Optional[np.ndarray]) -> None:
    if m is None:
        return
    masked = np.ma.masked_invalid(m)  # type: ignore
    max_values = np.max(masked, axis=0)
    min_values = np.min(masked, axis=0)
    m[:] = np.maximum(np.minimum(m, max_values), min_values)
    m[~np.isfinite(m)] = 0.0
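
As a rough illustration of the simplest features computed by `get_var_features` above, the sketch below reproduces features 1-3 (objective coefficients) and feature 37 (fractionality of the LP values) on toy vectors. The arrays are invented for the example; in the extractor they come from the HDF5 file.

```python
import numpy as np

# Toy objective coefficients and LP relaxation values (assumptions for illustration)
c = np.array([2.0, -1.0, 3.0, 0.0])
values = np.array([0.0, 0.25, 0.5, 2.9])

# Feature 1: sign of each objective coefficient
f1 = np.sign(c)
# Feature 2: |c_i| relative to the sum of the positive coefficients
f2 = np.abs(c) / c[c > 0].sum()
# Feature 3: |c_i| relative to the absolute sum of the negative coefficients
f3 = np.abs(c) / -c[c < 0].sum()
# Feature 37: distance from each LP value to the nearest integer
f37 = np.minimum(values - np.floor(values), np.ceil(values) - values)

features = np.column_stack([f1, f2, f3, f37])
```

If all coefficients share the same sign, one of the two sums is zero and the division produces infinities, which is presumably why the extractor wraps these computations in `np.errstate` and calls `_fix_infinity` afterwards.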
--- a/miplearn/extractors/__init__.py
+++ b/miplearn/extractors/__init__.py
--- a/miplearn/extractors/abstract.py
+++ b/miplearn/extractors/abstract.py
@@ -0,0 +1,19 @@
 from abc import ABC, abstractmethod
 import numpy as np
 from miplearn.h5 import H5File
 class FeaturesExtractor(ABC):
    @abstractmethod
    def get_instance_features(self, h5: H5File) -> np.ndarray:
        pass
    @abstractmethod
    def get_var_features(self, h5: H5File) -> np.ndarray:
        pass
    @abstractmethod
    def get_constr_features(self, h5: H5File) -> np.ndarray:
        pass
--- a/miplearn/extractors/dummy.py
+++ b/miplearn/extractors/dummy.py
@@ -0,0 +1,24 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import numpy as np
 from miplearn.extractors.abstract import FeaturesExtractor
 from miplearn.h5 import H5File
 class DummyExtractor(FeaturesExtractor):
    def get_instance_features(self, h5: H5File) -> np.ndarray:
        return np.zeros(1)
    def get_var_features(self, h5: H5File) -> np.ndarray:
        var_types = h5.get_array("static_var_types")
        assert var_types is not None
        n_vars = len(var_types)
        return np.zeros((n_vars, 1))
    def get_constr_features(self, h5: H5File) -> np.ndarray:
        constr_sense = h5.get_array("static_constr_sense")
        assert constr_sense is not None
        n_constr = len(constr_sense)
        return np.zeros((n_constr, 1))
--- a/miplearn/extractors/fields.py
+++ b/miplearn/extractors/fields.py
@@ -0,0 +1,69 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from typing import Optional, List
 import numpy as np
 from miplearn.extractors.abstract import FeaturesExtractor
 from miplearn.h5 import H5File
 class H5FieldsExtractor(FeaturesExtractor):
    def __init__(
        self,
        instance_fields: Optional[List[str]] = None,
        var_fields: Optional[List[str]] = None,
        constr_fields: Optional[List[str]] = None,
    ):
        self.instance_fields = instance_fields
        self.var_fields = var_fields
        self.constr_fields = constr_fields
    def get_instance_features(self, h5: H5File) -> np.ndarray:
        if self.instance_fields is None:
            raise Exception("No instance fields provided")
        x = []
        for field in self.instance_fields:
            try:
                data = h5.get_array(field)
            except ValueError:
                data = h5.get_scalar(field)
            assert data is not None
            x.append(data)
        x = np.hstack(x)
        assert len(x.shape) == 1
        return x
    def get_var_features(self, h5: H5File) -> np.ndarray:
        var_types = h5.get_array("static_var_types")
        assert var_types is not None
        n_vars = len(var_types)
        if self.var_fields is None:
            raise Exception("No var fields provided")
        return self._extract(h5, self.var_fields, n_vars)
    def get_constr_features(self, h5: H5File) -> np.ndarray:
        constr_sense = h5.get_array("static_constr_sense")
        assert constr_sense is not None
        n_constr = len(constr_sense)
        if self.constr_fields is None:
            raise Exception("No constr fields provided")
        return self._extract(h5, self.constr_fields, n_constr)
    def _extract(self, h5: H5File, fields: List[str], n_expected: int) -> np.ndarray:
        x = []
        for field in fields:
            try:
                data = h5.get_array(field)
            except ValueError:
                v = h5.get_scalar(field)
                data = np.repeat(v, n_expected)
            assert data is not None
            assert len(data.shape) == 1
            assert data.shape[0] == n_expected
            x.append(data)
        features = np.vstack(x).T
        assert len(features.shape) == 2
        assert features.shape[0] == n_expected
        return features
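
The `_extract` helper above builds a feature matrix with one row per variable (or constraint) and one column per field, broadcasting scalar fields to every row. A minimal sketch of that assembly, with a plain dictionary standing in for the HDF5 file and field contents that are purely hypothetical:

```python
import numpy as np

n_vars = 3

# Stand-in for the H5 file: two per-variable arrays and one scalar field
fields = {
    "static_var_obj_coeffs": np.array([1.0, 2.0, 3.0]),
    "static_var_upper_bounds": np.array([1.0, 1.0, 5.0]),
    "lp_obj_value": 7.5,
}

x = []
for name, data in fields.items():
    if np.isscalar(data):
        # Scalars are repeated so that every row receives the same value
        data = np.repeat(data, n_vars)
    assert data.shape == (n_vars,)
    x.append(data)

# Stack fields as rows, then transpose to get one row per variable
features = np.vstack(x).T
```

The transpose is what turns the field-major stack into the `(n_expected, n_fields)` layout that the primal components expect.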
--- a/miplearn/features/extractor.py
+++ b/miplearn/features/extractor.py
@@ -1,504 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from math import log, isfinite
 from typing import TYPE_CHECKING, List, Tuple, Optional
 import numpy as np
 from scipy.sparse import coo_matrix
 from miplearn.features.sample import Sample
 from miplearn.solvers.internal import LPSolveStats
 if TYPE_CHECKING:
    from miplearn.solvers.internal import InternalSolver
    from miplearn.instance.base import Instance
 # noinspection PyPep8Naming
 class FeaturesExtractor:
    def __init__(
        self,
        with_sa: bool = True,
        with_lhs: bool = True,
    ) -> None:
        self.with_sa = with_sa
        self.with_lhs = with_lhs
        self.var_features_user: Optional[np.ndarray] = None
    def extract_after_load_features(
        self,
        instance: "Instance",
        solver: "InternalSolver",
        sample: Sample,
    ) -> None:
        variables = solver.get_variables(with_static=True)
        constraints = solver.get_constraints(with_static=True, with_lhs=self.with_lhs)
        assert constraints.names is not None
        sample.put_array("static_var_lower_bounds", variables.lower_bounds)
        sample.put_array("static_var_names", variables.names)
        sample.put_array("static_var_obj_coeffs", variables.obj_coeffs)
        sample.put_array("static_var_types", variables.types)
        sample.put_array("static_var_upper_bounds", variables.upper_bounds)
        sample.put_array("static_constr_names", constraints.names)
        sample.put_sparse("static_constr_lhs", constraints.lhs)
        sample.put_array("static_constr_rhs", constraints.rhs)
        sample.put_array("static_constr_senses", constraints.senses)
        # Instance features
        self._extract_user_features_instance(instance, sample)
        # Constraint features
        (
            constr_features,
            constr_categories,
            constr_lazy,
        ) = FeaturesExtractor._extract_user_features_constrs(
            instance,
            constraints.names,
        )
        sample.put_array("static_constr_features", constr_features)
        sample.put_array("static_constr_categories", constr_categories)
        sample.put_array("static_constr_lazy", constr_lazy)
        sample.put_scalar("static_constr_lazy_count", int(constr_lazy.sum()))
        # Variable features
        (
            vars_features_user,
            var_categories,
        ) = self._extract_user_features_vars(instance, sample)
        self.var_features_user = vars_features_user
        sample.put_array("static_var_categories", var_categories)
        assert variables.lower_bounds is not None
        assert variables.obj_coeffs is not None
        assert variables.upper_bounds is not None
        sample.put_array(
            "static_var_features",
            np.hstack(
                [
                    vars_features_user,
                    self._compute_AlvLouWeh2017(
                        A=constraints.lhs,
                        b=constraints.rhs,
                        c=variables.obj_coeffs,
                    ),
                ]
            ),
        )
    def extract_after_lp_features(
        self,
        solver: "InternalSolver",
        sample: Sample,
        lp_stats: LPSolveStats,
    ) -> None:
        for (k, v) in lp_stats.__dict__.items():
            sample.put_scalar(k, v)
        variables = solver.get_variables(with_static=False, with_sa=self.with_sa)
        constraints = solver.get_constraints(with_static=False, with_sa=self.with_sa)
        sample.put_array("lp_var_basis_status", variables.basis_status)
        sample.put_array("lp_var_reduced_costs", variables.reduced_costs)
        sample.put_array("lp_var_sa_lb_down", variables.sa_lb_down)
        sample.put_array("lp_var_sa_lb_up", variables.sa_lb_up)
        sample.put_array("lp_var_sa_obj_down", variables.sa_obj_down)
        sample.put_array("lp_var_sa_obj_up", variables.sa_obj_up)
        sample.put_array("lp_var_sa_ub_down", variables.sa_ub_down)
        sample.put_array("lp_var_sa_ub_up", variables.sa_ub_up)
        sample.put_array("lp_var_values", variables.values)
        sample.put_array("lp_constr_basis_status", constraints.basis_status)
        sample.put_array("lp_constr_dual_values", constraints.dual_values)
        sample.put_array("lp_constr_sa_rhs_down", constraints.sa_rhs_down)
        sample.put_array("lp_constr_sa_rhs_up", constraints.sa_rhs_up)
        sample.put_array("lp_constr_slacks", constraints.slacks)
        # Variable features
        lp_var_features_list = []
        for f in [
            self.var_features_user,
            self._compute_AlvLouWeh2017(
                A=sample.get_sparse("static_constr_lhs"),
                b=sample.get_array("static_constr_rhs"),
                c=sample.get_array("static_var_obj_coeffs"),
                c_sa_up=variables.sa_obj_up,
                c_sa_down=variables.sa_obj_down,
                values=variables.values,
            ),
        ]:
            if f is not None:
                lp_var_features_list.append(f)
        for f in [
            variables.reduced_costs,
            variables.sa_lb_down,
            variables.sa_lb_up,
            variables.sa_obj_down,
            variables.sa_obj_up,
            variables.sa_ub_down,
            variables.sa_ub_up,
            variables.values,
        ]:
            if f is not None:
                lp_var_features_list.append(f.reshape(-1, 1))
        lp_var_features = np.hstack(lp_var_features_list)
        _fix_infinity(lp_var_features)
        sample.put_array("lp_var_features", lp_var_features)
        # Constraint features
        lp_constr_features_list = []
        for f in [sample.get_array("static_constr_features")]:
            if f is not None:
                lp_constr_features_list.append(f)
        for f in [
            sample.get_array("lp_constr_dual_values"),
            sample.get_array("lp_constr_sa_rhs_down"),
            sample.get_array("lp_constr_sa_rhs_up"),
            sample.get_array("lp_constr_slacks"),
        ]:
            if f is not None:
                lp_constr_features_list.append(f.reshape(-1, 1))
        lp_constr_features = np.hstack(lp_constr_features_list)
        _fix_infinity(lp_constr_features)
        sample.put_array("lp_constr_features", lp_constr_features)
        # Build lp_instance_features
        static_instance_features = sample.get_array("static_instance_features")
        assert static_instance_features is not None
        assert lp_stats.lp_value is not None
        assert lp_stats.lp_wallclock_time is not None
        sample.put_array(
            "lp_instance_features",
            np.hstack(
                [
                    static_instance_features,
                    lp_stats.lp_value,
                    lp_stats.lp_wallclock_time,
                ]
            ),
        )
    def extract_after_mip_features(
        self,
        solver: "InternalSolver",
        sample: Sample,
    ) -> None:
        variables = solver.get_variables(with_static=False, with_sa=False)
        constraints = solver.get_constraints(with_static=False, with_sa=False)
        sample.put_array("mip_var_values", variables.values)
        sample.put_array("mip_constr_slacks", constraints.slacks)
    # noinspection DuplicatedCode
    def _extract_user_features_vars(
        self,
        instance: "Instance",
        sample: Sample,
    ) -> Tuple[np.ndarray, np.ndarray]:
        # Query variable names
        var_names = sample.get_array("static_var_names")
        assert var_names is not None
        # Query variable features
        var_features = instance.get_variable_features(var_names)
        assert isinstance(var_features, np.ndarray), (
            f"Variable features must be a numpy array. "
            f"Found {var_features.__class__} instead."
        )
        assert len(var_features.shape) == 2, (
            f"Variable features must be a 2-dimensional array. "
            f"Found array with shape {var_features.shape} instead."
        )
        assert var_features.shape[0] == len(var_names), (
            f"Variable features must have exactly {len(var_names)} rows. "
            f"Found {var_features.shape[0]} rows instead."
        )
        assert var_features.dtype.kind in ["f"], (
            f"Variable features must be floating point numbers. "
            f"Found {var_features.dtype} instead."
        )
        # Query variable categories
        var_categories = instance.get_variable_categories(var_names)
        assert isinstance(var_categories, np.ndarray), (
            f"Variable categories must be a numpy array. "
            f"Found {var_categories.__class__} instead."
        )
        assert len(var_categories.shape) == 1, (
            f"Variable categories must be a vector. "
            f"Found array with shape {var_categories.shape} instead."
        )
        assert len(var_categories) == len(var_names), (
            f"Variable categories must have exactly {len(var_names)} elements. "
            f"Found {var_categories.shape[0]} elements instead."
        )
        assert var_categories.dtype.kind == "S", (
            f"Variable categories must be a numpy array with dtype='S'. "
            f"Found {var_categories.dtype} instead."
        )
        return var_features, var_categories
    # noinspection DuplicatedCode
    @classmethod
    def _extract_user_features_constrs(
        cls,
        instance: "Instance",
        constr_names: np.ndarray,
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        # Query constraint features
        constr_features = instance.get_constraint_features(constr_names)
        assert isinstance(constr_features, np.ndarray), (
            f"get_constraint_features must return a numpy array. "
            f"Found {constr_features.__class__} instead."
        )
        assert len(constr_features.shape) == 2, (
            f"get_constraint_features must return a 2-dimensional array. "
            f"Found array with shape {constr_features.shape} instead."
        )
        assert constr_features.shape[0] == len(constr_names), (
            f"get_constraint_features must return an array with {len(constr_names)} "
            f"rows. Found {constr_features.shape[0]} rows instead."
        )
        assert constr_features.dtype.kind in ["f"], (
            f"get_constraint_features must return floating point numbers. "
            f"Found {constr_features.dtype} instead."
        )
        # Query constraint categories
        constr_categories = instance.get_constraint_categories(constr_names)
        assert isinstance(constr_categories, np.ndarray), (
            f"get_constraint_categories must return a numpy array. "
            f"Found {constr_categories.__class__} instead."
        )
        assert len(constr_categories.shape) == 1, (
            f"get_constraint_categories must return a vector. "
            f"Found array with shape {constr_categories.shape} instead."
        )
        assert len(constr_categories) == len(constr_names), (
            f"get_constraint_categories must return a vector with {len(constr_names)} "
            f"elements. Found {constr_categories.shape[0]} elements instead."
        )
        assert constr_categories.dtype.kind == "S", (
            f"get_constraint_categories must return a numpy array with dtype='S'. "
            f"Found {constr_categories.dtype} instead."
        )
        # Query constraint lazy attribute
        constr_lazy = instance.are_constraints_lazy(constr_names)
        assert isinstance(constr_lazy, np.ndarray), (
            f"are_constraints_lazy must return a numpy array. "
            f"Found {constr_lazy.__class__} instead."
        )
        assert len(constr_lazy.shape) == 1, (
            f"are_constraints_lazy must return a vector. "
            f"Found array with shape {constr_lazy.shape} instead."
        )
        assert constr_lazy.shape[0] == len(constr_names), (
            f"are_constraints_lazy must return a vector with {len(constr_names)} "
            f"elements. Found {constr_lazy.shape[0]} elements instead."
        )
        assert constr_lazy.dtype.kind == "b", (
            f"are_constraints_lazy must return a boolean array. "
            f"Found {constr_lazy.dtype} instead."
        )
        return constr_features, constr_categories, constr_lazy
    def _extract_user_features_instance(
        self,
        instance: "Instance",
        sample: Sample,
    ) -> None:
        features = instance.get_instance_features()
        assert isinstance(features, np.ndarray), (
            f"Instance features must be a numpy array. "
            f"Found {features.__class__} instead."
        )
        assert len(features.shape) == 1, (
            f"Instance features must be a vector. "
            f"Found array with shape {features.shape} instead."
        )
        assert features.dtype.kind in [
            "f"
        ], f"Instance features have unsupported {features.dtype}"
        sample.put_array("static_instance_features", features)
    @classmethod
    def _compute_AlvLouWeh2017(
        cls,
        A: Optional[coo_matrix] = None,
        b: Optional[np.ndarray] = None,
        c: Optional[np.ndarray] = None,
        c_sa_down: Optional[np.ndarray] = None,
        c_sa_up: Optional[np.ndarray] = None,
        values: Optional[np.ndarray] = None,
        with_m1: bool = True,
        with_m2: bool = True,
        with_m3: bool = True,
    ) -> np.ndarray:
        """
        Computes static variable features described in:
            Alvarez, A. M., Louveaux, Q., & Wehenkel, L. (2017). A machine learning-based
            approximation of strong branching. INFORMS Journal on Computing, 29(1),
            185-195.
        """
        assert b is not None
        assert c is not None
        nvars = len(c)
        curr = 0
        max_n_features = 40
        features = np.zeros((nvars, max_n_features))
        def push(v: np.ndarray) -> None:
            nonlocal curr
            features[:, curr] = v
            curr += 1
        def push_sign_abs(v: np.ndarray) -> None:
            push(np.sign(v))
            push(np.abs(v))
        def maxmin(M: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
            M_max = np.ravel(M.max(axis=0).todense())
            M_min = np.ravel(M.min(axis=0).todense())
            return M_max, M_min
        with np.errstate(divide="ignore", invalid="ignore"):
            # Feature 1
            push(np.sign(c))
            # Feature 2
            c_pos_sum = c[c > 0].sum()
            push(np.abs(c) / c_pos_sum)
            # Feature 3
            c_neg_sum = -c[c < 0].sum()
            push(np.abs(c) / c_neg_sum)
            if A is not None and with_m1:
                # Compute A_ji / |b_j|
                M1 = A.T.multiply(1.0 / np.abs(b)).T.tocsr()
                # Select rows with positive b_j and compute max/min
                M1_pos = M1[b > 0, :]
                if M1_pos.shape[0] > 0:
                    M1_pos_max = M1_pos.max(axis=0).todense()
                    M1_pos_min = M1_pos.min(axis=0).todense()
                else:
                    M1_pos_max = np.zeros(nvars)
                    M1_pos_min = np.zeros(nvars)
                # Select rows with negative b_j and compute max/min
                M1_neg = M1[b < 0, :]
                if M1_neg.shape[0] > 0:
                    M1_neg_max = M1_neg.max(axis=0).todense()
                    M1_neg_min = M1_neg.min(axis=0).todense()
                else:
                    M1_neg_max = np.zeros(nvars)
                    M1_neg_min = np.zeros(nvars)
                # Features 4-11
                push_sign_abs(M1_pos_min)
                push_sign_abs(M1_pos_max)
                push_sign_abs(M1_neg_min)
                push_sign_abs(M1_neg_max)
            if A is not None and with_m2:
                # Compute |c_i| / A_ij
                M2 = A.power(-1).multiply(np.abs(c)).tocsc()
                # Compute max/min
                M2_max, M2_min = maxmin(M2)
                # Make copies of M2 and erase elements based on sign(c)
                M2_pos_max = M2_max.copy()
                M2_neg_max = M2_max.copy()
                M2_pos_min = M2_min.copy()
                M2_neg_min = M2_min.copy()
                M2_pos_max[c <= 0] = 0
                M2_pos_min[c <= 0] = 0
                M2_neg_max[c >= 0] = 0
                M2_neg_min[c >= 0] = 0
                # Features 12-19
                push_sign_abs(M2_pos_min)
                push_sign_abs(M2_pos_max)
                push_sign_abs(M2_neg_min)
                push_sign_abs(M2_neg_max)
            if A is not None and with_m3:
                # Compute row sums
                S_pos = A.maximum(0).sum(axis=1)
                S_neg = np.abs(A.minimum(0).sum(axis=1))
                # Divide A by positive and negative row sums
                M3_pos = A.multiply(1 / S_pos).tocsr()
                M3_neg = A.multiply(1 / S_neg).tocsr()
                # Remove +inf and -inf generated by division by zero
                M3_pos.data[~np.isfinite(M3_pos.data)] = 0.0
                M3_neg.data[~np.isfinite(M3_neg.data)] = 0.0
                M3_pos.eliminate_zeros()
                M3_neg.eliminate_zeros()
                # Split each matrix into positive and negative parts
                M3_pos_pos = M3_pos.maximum(0)
                M3_pos_neg = -(M3_pos.minimum(0))
                M3_neg_pos = M3_neg.maximum(0)
                M3_neg_neg = -(M3_neg.minimum(0))
                # Calculate max/min
                M3_pos_pos_max, M3_pos_pos_min = maxmin(M3_pos_pos)
                M3_pos_neg_max, M3_pos_neg_min = maxmin(M3_pos_neg)
                M3_neg_pos_max, M3_neg_pos_min = maxmin(M3_neg_pos)
                M3_neg_neg_max, M3_neg_neg_min = maxmin(M3_neg_neg)
                # Features 20-35
                push_sign_abs(M3_pos_pos_max)
                push_sign_abs(M3_pos_pos_min)
                push_sign_abs(M3_pos_neg_max)
                push_sign_abs(M3_pos_neg_min)
                push_sign_abs(M3_neg_pos_max)
                push_sign_abs(M3_neg_pos_min)
                push_sign_abs(M3_neg_neg_max)
                push_sign_abs(M3_neg_neg_min)
            # Feature 36: only available during B&B
            # Feature 37
            if values is not None:
                push(
                    np.minimum(
                        values - np.floor(values),
                        np.ceil(values) - values,
                    )
                )
            # Features 38-43: only available during B&B
            # Feature 44
            if c_sa_up is not None:
                assert c_sa_down is not None
                # Features 44 and 46
                push(np.sign(c_sa_up))
                push(np.sign(c_sa_down))
                # Feature 45 is duplicated
                # Feature 47-48
                push(np.log(c - c_sa_down / np.sign(c)))
                push(np.log(c - c_sa_up / np.sign(c)))
                # Features 49-64: only available during B&B
        features = features[:, 0:curr]
        _fix_infinity(features)
        return features
 def _fix_infinity(m: Optional[np.ndarray]) -> None:
    if m is None:
        return
    masked = np.ma.masked_invalid(m)
    max_values = np.max(masked, axis=0)
    min_values = np.min(masked, axis=0)
    m[:] = np.maximum(np.minimum(m, max_values), min_values)
    m[~np.isfinite(m)] = 0.0
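As read from the code above, `_fix_infinity` appears intended to clamp infinite entries to the largest and smallest finite values of their column and to replace anything still non-finite with zero. The sketch below illustrates that behavior under this assumption, using plain NumPy instead of masked arrays. The name `clamp_non_finite` is hypothetical, and unlike the original it returns a new array rather than modifying its argument in place.

```python
import numpy as np


def clamp_non_finite(m: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: clamp +/-inf to column-wise finite bounds, NaN to 0."""
    finite = np.isfinite(m)
    # Column-wise max/min computed over finite entries only.
    col_max = np.where(finite, m, -np.inf).max(axis=0)
    col_min = np.where(finite, m, np.inf).min(axis=0)
    # Columns without any finite entry fall back to zero bounds.
    col_max[~np.isfinite(col_max)] = 0.0
    col_min[~np.isfinite(col_min)] = 0.0
    # np.clip maps +inf to col_max and -inf to col_min; NaN passes through.
    out = np.clip(m, col_min, col_max)
    out[np.isnan(out)] = 0.0
    return out


m = np.array([[1.0, np.inf], [-np.inf, 2.0], [3.0, np.nan]])
cleaned = clamp_non_finite(m)
```

Clamping to column-wise bounds keeps each feature on the scale of its finite values, so a single unbounded sensitivity range does not dominate the column.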
--- a/miplearn/features/sample.py
+++ b/miplearn/features/sample.py
@@ -1,19 +1,18 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
-#  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
+#  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
-import warnings
+
-from abc import ABC, abstractmethod
+from types import TracebackType
-from copy import deepcopy
+from typing import Optional, Any, Union, List, Type, Literal
 from typing import Dict, Optional, Any, Union, List, Tuple, cast, Set
 from scipy.sparse import coo_matrix
 import h5py
 import numpy as np
-from h5py import Dataset
+from scipy.sparse import coo_matrix
 from overrides import overrides
 Bytes = Union[bytes, bytearray]
 Scalar = Union[None, bool, str, int, float]
 Vector = Union[
    None,
    List[bool],
@@ -23,6 +22,7 @@ Vector = Union[
    List[Optional[str]],
    np.ndarray,
 ]
 VectorList = Union[
    List[List[bool]],
    List[List[str]],
@@ -35,115 +35,7 @@ VectorList = Union[
 ]
-class Sample(ABC):
+class H5File:
    """Abstract dictionary-like class that stores training data."""
    @abstractmethod
    def get_scalar(self, key: str) -> Optional[Any]:
        pass
    @abstractmethod
    def put_scalar(self, key: str, value: Scalar) -> None:
        pass
    @abstractmethod
    def put_array(self, key: str, value: Optional[np.ndarray]) -> None:
        pass
    @abstractmethod
    def get_array(self, key: str) -> Optional[np.ndarray]:
        pass
    @abstractmethod
    def put_sparse(self, key: str, value: coo_matrix) -> None:
        pass
    @abstractmethod
    def get_sparse(self, key: str) -> Optional[coo_matrix]:
        pass
    def _assert_is_scalar(self, value: Any) -> None:
        if value is None:
            return
        if isinstance(value, (str, bool, int, float, bytes, np.bytes_)):
            return
        assert False, f"scalar expected; found instead: {value} ({value.__class__})"
    def _assert_is_array(self, value: np.ndarray) -> None:
        assert isinstance(
            value, np.ndarray
        ), f"np.ndarray expected; found instead: {value.__class__}"
        assert value.dtype.kind in "biufS", f"Unsupported dtype: {value.dtype}"
    def _assert_is_sparse(self, value: Any) -> None:
        assert isinstance(
            value, coo_matrix
        ), f"coo_matrix expected; found: {value.__class__}"
        self._assert_is_array(value.data)
 class MemorySample(Sample):
    """Dictionary-like class that stores training data in-memory."""
    def __init__(
        self,
        data: Optional[Dict[str, Any]] = None,
    ) -> None:
        if data is None:
            data = {}
        self._data: Dict[str, Any] = data
    @overrides
    def get_scalar(self, key: str) -> Optional[Any]:
        return self._get(key)
    @overrides
    def put_scalar(self, key: str, value: Scalar) -> None:
        if value is None:
            return
        self._assert_is_scalar(value)
        self._put(key, value)
    def _get(self, key: str) -> Optional[Any]:
        if key in self._data:
            return self._data[key]
        else:
            return None
    def _put(self, key: str, value: Any) -> None:
        self._data[key] = value
    @overrides
    def put_array(self, key: str, value: Optional[np.ndarray]) -> None:
        if value is None:
            return
        self._assert_is_array(value)
        self._put(key, value)
    @overrides
    def get_array(self, key: str) -> Optional[np.ndarray]:
        return cast(Optional[np.ndarray], self._get(key))
    @overrides
    def put_sparse(self, key: str, value: coo_matrix) -> None:
        if value is None:
            return
        self._assert_is_sparse(value)
        self._put(key, value)
    @overrides
    def get_sparse(self, key: str) -> Optional[coo_matrix]:
        return cast(Optional[coo_matrix], self._get(key))
 class Hdf5Sample(Sample):
    """
    Dictionary-like class that stores training data in an HDF5 file.
    Unlike MemorySample, this class only loads to memory the parts of the data set that
    are actually accessed, and therefore it is more scalable.
    """
    def __init__(
        self,
        filename: str,
@@ -151,7 +43,6 @@ class Hdf5Sample(Sample):
    ) -> None:
        self.file = h5py.File(filename, mode, libver="latest")
    @overrides
    def get_scalar(self, key: str) -> Optional[Any]:
        if key not in self.file:
            return None
@@ -164,7 +55,6 @@ class Hdf5Sample(Sample):
        else:
            return ds[()].tolist()
    @overrides
    def put_scalar(self, key: str, value: Any) -> None:
        if value is None:
            return
@@ -173,7 +63,6 @@ class Hdf5Sample(Sample):
            del self.file[key]
        self.file.create_dataset(key, data=value)
    @overrides
    def put_array(self, key: str, value: Optional[np.ndarray]) -> None:
        if value is None:
            return
@@ -184,13 +73,11 @@ class Hdf5Sample(Sample):
            del self.file[key]
        return self.file.create_dataset(key, data=value, compression="gzip")
    @overrides
    def get_array(self, key: str) -> Optional[np.ndarray]:
        if key not in self.file:
            return None
        return self.file[key][:]
    @overrides
    def put_sparse(self, key: str, value: coo_matrix) -> None:
        if value is None:
            return
@@ -199,7 +86,6 @@ class Hdf5Sample(Sample):
        self.put_array(f"{key}_col", value.col)
        self.put_array(f"{key}_data", value.data)
    @overrides
    def get_sparse(self, key: str) -> Optional[coo_matrix]:
        row = self.get_array(f"{key}_row")
        if row is None:
@@ -225,8 +111,36 @@ class Hdf5Sample(Sample):
        ), f"bytes expected; found: {value.__class__}"  # type: ignore
        self.put_array(key, np.frombuffer(value, dtype="uint8"))
-    def __enter__(self):
+    def close(self):
        self.file.close()
    def __enter__(self) -> "H5File":
        return self
-    def __exit__(self, type, value, traceback):
+    def __exit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_val: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> Literal[False]:
        self.file.close()
        return False
    def _assert_is_scalar(self, value: Any) -> None:
        if value is None:
            return
        if isinstance(value, (str, bool, int, float, bytes, np.bytes_)):
            return
        assert False, f"scalar expected; found instead: {value} ({value.__class__})"
    def _assert_is_array(self, value: np.ndarray) -> None:
        assert isinstance(
            value, np.ndarray
        ), f"np.ndarray expected; found instead: {value.__class__}"
        assert value.dtype.kind in "biufS", f"Unsupported dtype: {value.dtype}"
    def _assert_is_sparse(self, value: Any) -> None:
        assert isinstance(
            value, coo_matrix
        ), f"coo_matrix expected; found: {value.__class__}"
        self._assert_is_array(value.data)
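The hunk above gives the class an explicit `close` method and a typed `__exit__` that returns `False`. A minimal sketch of that context-manager pattern follows, using a hypothetical `Resource` class rather than the real `H5File`. Returning `False` from `__exit__` means an exception raised inside the `with` block is not suppressed, while the resource is still closed.

```python
from types import TracebackType
from typing import Literal, Optional, Type


class Resource:
    """Hypothetical stand-in for a file-backed store such as H5File."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "Resource":
        return self

    def __exit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_val: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> Literal[False]:
        # Always release the resource, then let any exception propagate.
        self.close()
        return False


res = Resource()
try:
    with res:
        raise ValueError("boom")
except ValueError:
    propagated = True
else:
    propagated = False
```

Annotating the return type as `Literal[False]` also tells type checkers that the `with` block never swallows exceptions.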
--- a/miplearn/instance/init.py
+++ b/miplearn/instance/init.py
@@ -1,3 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
--- a/miplearn/instance/base.py
+++ b/miplearn/instance/base.py
@@ -1,204 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 from abc import ABC, abstractmethod
 from typing import Any, List, TYPE_CHECKING, Dict
 import numpy as np
 from miplearn.features.sample import Sample, MemorySample
 from miplearn.types import ConstraintName
 logger = logging.getLogger(__name__)
 if TYPE_CHECKING:
    from miplearn.solvers.learning import InternalSolver
 # noinspection PyMethodMayBeStatic
 class Instance(ABC):
    """
    Abstract class holding all the data necessary to generate a concrete model of the
    problem.
    In the knapsack problem, for example, this class could hold the number of items,
    their weights and costs, as well as the size of the knapsack. Objects
    implementing this class are able to convert themselves into a concrete
    optimization model, which can be optimized by a solver, or into arrays of
    features, which can be provided as inputs to machine learning models.
    """
    def __init__(self) -> None:
        self._samples: List[Sample] = []
    @abstractmethod
    def to_model(self) -> Any:
        """
        Returns the optimization model corresponding to this instance.
        """
        pass
    def get_instance_features(self) -> np.ndarray:
        """
        Returns a 1-dimensional array of (numerical) features describing the
        entire instance.
        The array is used by LearningSolver to determine how similar two instances
        are. It may also be used to predict, in combination with variable-specific
        features, the values of binary decision variables in the problem.
        There is not necessarily a one-to-one correspondence between models and
        instance features: the features may encode only part of the data necessary to
        generate the complete model. Features may also be statistics computed from
        the original data. For example, in the knapsack problem, an implementation
        may decide to provide as instance features only the average weights, average
        prices, number of items and the size of the knapsack.
        The returned array MUST have the same length for all relevant instances of
        the problem. If two instances map into arrays of different lengths,
        they cannot be solved by the same LearningSolver object.
        By default, returns [0.0].
        """
        return np.zeros(1)
    def get_variable_features(self, names: np.ndarray) -> np.ndarray:
        """
        Returns a 2-dimensional array in which each row contains the numerical
        features describing the decision variable with the corresponding name.
        In combination with instance features, variable features are used by
        LearningSolver to predict, among other things, the optimal value of each
        decision variable before the optimization takes place. In the knapsack
        problem, for example, an implementation could provide as variable features
        the weight and the price of a specific item.
        Like instance features, the arrays returned by this method MUST have the same
        length for all variables within the same category, for all relevant instances
        of the problem.
        If features are not provided for a given variable, MIPLearn will use a
        default set of features.
        By default, returns [[0.0], ..., [0.0]].
        """
        return np.zeros((len(names), 1))
    def get_variable_categories(self, names: np.ndarray) -> np.ndarray:
        """
        Returns a dictionary mapping the name of each variable to its category.
        If two variables have the same category, LearningSolver will use the same
        internal ML model to predict the values of both variables. If a variable is not
        listed in the dictionary, ML models will ignore the variable.
        By default, returns `names`.
        """
        return names
    def get_constraint_features(self, names: np.ndarray) -> np.ndarray:
        return np.zeros((len(names), 1))
    def get_constraint_categories(self, names: np.ndarray) -> np.ndarray:
        return names
    def has_dynamic_lazy_constraints(self) -> bool:
        return False
    def are_constraints_lazy(self, names: np.ndarray) -> np.ndarray:
        return np.zeros(len(names), dtype=bool)
    def find_violated_lazy_constraints(
        self,
        solver: "InternalSolver",
        model: Any,
    ) -> Dict[ConstraintName, Any]:
        """
        Returns lazy constraint violations found for the current solution.
        After solving a model, LearningSolver will ask the instance to identify which
        lazy constraints are violated by the current solution. For each identified
        violation, LearningSolver will then call `enforce_lazy_constraint` and
        re-solve the problem. The process repeats until no further lazy constraint
        violations are found.
        Violations should be returned in a dictionary mapping the name of the violation
        to some user-specified data that allows the instance to unambiguously generate
        the lazy constraints at a later time. In the Traveling Salesman Problem, for
        example, this function could return a dictionary identifying violated subtour
        inequalities. More concretely, it could return:
            {
                "s1": [1, 2, 3],
                "s2": [4, 5, 6, 7],
            }
        where "s1" and "s2" are the names of the subtours, and [1,2,3] and [4,5,6,7]
        are the cities in each subtour. The names of the violations should be kept
        stable across instances. In our example, "s1" should always correspond to
        [1,2,3] across all instances. The user-provided data should be picklable.
        The current solution can be queried with `solver.get_solution()`. If the solver
        is configured to use lazy callbacks, this solution may be non-integer.
        For a concrete example, see TravelingSalesmanInstance.
        """
        return {}
    def enforce_lazy_constraint(
        self,
        solver: "InternalSolver",
        model: Any,
        violation_data: Any,
    ) -> None:
        """
        Adds constraints to the model to ensure that the given violation is fixed.
        This method is typically called immediately after
        `find_violated_lazy_constraints`. The argument `violation_data` is the
        user-provided data, previously returned by `find_violated_lazy_constraints`.
        In the Traveling Salesman Problem, for example, it could be a list of cities
        in the subtour.
        After some training, LearningSolver may decide to proactively build some lazy
        constraints at the beginning of the optimization process, before a solution
        is even available. In this case, `enforce_lazy_constraint` will be called
        without a corresponding call to `find_violated_lazy_constraints`.
        For a concrete example, see TravelingSalesmanInstance.
        """
        pass
    def has_user_cuts(self) -> bool:
        return False
    def find_violated_user_cuts(self, model: Any) -> Dict[ConstraintName, Any]:
        return {}
    def enforce_user_cut(
        self,
        solver: "InternalSolver",
        model: Any,
        violation_data: Any,
    ) -> Any:
        return None
    def load(self) -> None:
        pass
    def free(self) -> None:
        pass
    def flush(self) -> None:
        """
        Save any pending changes made to the instance to the underlying data store.
        """
        pass
    def get_samples(self) -> List[Sample]:
        return self._samples
    def create_sample(self) -> Sample:
        sample = MemorySample()
        self._samples.append(sample)
        return sample
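Note on the lazy-constraint docstrings above: the dictionary returned by `find_violated_lazy_constraints` maps a stable name to picklable data. The following is a minimal sketch of how such a dictionary could be built for the Traveling Salesman example, assuming the current solution has already been reduced to a list of selected edges. The helper `find_subtours` and its naming scheme are hypothetical and not part of MIPLearn; see TravelingSalesmanInstance for the actual implementation.

```python
from typing import Dict, List, Tuple


def find_subtours(
    n_cities: int,
    edges: List[Tuple[int, int]],
) -> Dict[str, List[int]]:
    """Return connected components of the selected edges that miss some city.

    Hypothetical helper. Each name is derived from the sorted city list, so
    the same subtour receives the same name in every instance.
    """
    # Build an undirected adjacency list from the selected edges.
    adjacency: Dict[int, List[int]] = {c: [] for c in range(n_cities)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    seen = set()
    violations: Dict[str, List[int]] = {}
    for start in range(n_cities):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack = [start]
        component = []
        seen.add(start)
        while stack:
            city = stack.pop()
            component.append(city)
            for nxt in adjacency[city]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # A component that does not cover every city is a subtour.
        if len(component) < n_cities:
            cities = sorted(component)
            name = "subtour[" + ",".join(map(str, cities)) + "]"
            violations[name] = cities
    return violations
```

Deriving the name from the city list is one way to satisfy the requirement that names stay stable across instances; plain lists of integers are also picklable, as the docstring requires.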
--- a/miplearn/instance/file.py
+++ b/miplearn/instance/file.py
@@ -1,131 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import gc
 import os
 import pickle
 from typing import Any, Optional, List, Dict, TYPE_CHECKING
 import numpy as np
 from overrides import overrides
 from miplearn.features.sample import Hdf5Sample, Sample
 from miplearn.instance.base import Instance
 from miplearn.types import ConstraintName
 if TYPE_CHECKING:
    from miplearn.solvers.learning import InternalSolver
 class FileInstance(Instance):
    def __init__(self, filename: str) -> None:
        super().__init__()
        assert os.path.exists(filename), f"File not found: {filename}"
        self.h5 = Hdf5Sample(filename)
        self.instance: Optional[Instance] = None
    # Delegation
    # -------------------------------------------------------------------------
    @overrides
    def to_model(self) -> Any:
        assert self.instance is not None
        return self.instance.to_model()
    @overrides
    def get_instance_features(self) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_instance_features()
    @overrides
    def get_variable_features(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_variable_features(names)
    @overrides
    def get_variable_categories(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_variable_categories(names)
    @overrides
    def get_constraint_features(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_constraint_features(names)
    @overrides
    def get_constraint_categories(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_constraint_categories(names)
    @overrides
    def has_dynamic_lazy_constraints(self) -> bool:
        assert self.instance is not None
        return self.instance.has_dynamic_lazy_constraints()
    @overrides
    def are_constraints_lazy(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.are_constraints_lazy(names)
    @overrides
    def find_violated_lazy_constraints(
        self,
        solver: "InternalSolver",
        model: Any,
    ) -> Dict[ConstraintName, Any]:
        assert self.instance is not None
        return self.instance.find_violated_lazy_constraints(solver, model)
    @overrides
    def enforce_lazy_constraint(
        self,
        solver: "InternalSolver",
        model: Any,
        violation_data: Any,
    ) -> None:
        assert self.instance is not None
        self.instance.enforce_lazy_constraint(solver, model, violation_data)
    @overrides
    def find_violated_user_cuts(self, model: Any) -> Dict[ConstraintName, Any]:
        assert self.instance is not None
        return self.instance.find_violated_user_cuts(model)
    @overrides
    def enforce_user_cut(
        self,
        solver: "InternalSolver",
        model: Any,
        violation_data: Any,
    ) -> None:
        assert self.instance is not None
        self.instance.enforce_user_cut(solver, model, violation_data)
    # Input & Output
    # -------------------------------------------------------------------------
    @overrides
    def free(self) -> None:
        self.instance = None
        gc.collect()
    @overrides
    def load(self) -> None:
        if self.instance is not None:
            return
        pkl = self.h5.get_bytes("pickled")
        assert pkl is not None
        self.instance = pickle.loads(pkl)
        assert isinstance(self.instance, Instance)
    @classmethod
    def save(cls, instance: Instance, filename: str) -> None:
        h5 = Hdf5Sample(filename, mode="w")
        instance_pkl = pickle.dumps(instance)
        h5.put_bytes("pickled", instance_pkl)
    @overrides
    def create_sample(self) -> Sample:
        return self.h5
    @overrides
    def get_samples(self) -> List[Sample]:
        return [self.h5]
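Note on `FileInstance`: every delegated method asserts that `load` has been called, and `free` drops the wrapped object again. A minimal sketch of that load/free proxy pattern, using a plain pickle file rather than HDF5, is shown below. The class `LazyPickle` and the function `demo` are hypothetical names used only for illustration.

```python
import gc
import os
import pickle
import tempfile
from typing import Any, Optional


class LazyPickle:
    """Hypothetical stand-in for the load/free pattern; not part of MIPLearn."""

    def __init__(self, filename: str) -> None:
        self.filename = filename
        self.obj: Optional[Any] = None  # nothing is read at construction time

    def load(self) -> None:
        # Loading is idempotent: a second call does nothing.
        if self.obj is None:
            with open(self.filename, "rb") as f:
                self.obj = pickle.load(f)

    def free(self) -> None:
        # Drop the reference and ask the garbage collector to reclaim memory.
        self.obj = None
        gc.collect()

    def get(self) -> Any:
        assert self.obj is not None, "call load() first"
        return self.obj


def demo() -> Any:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "instance.pkl")
        with open(path, "wb") as f:
            pickle.dump({"n_cities": 5}, f)
        proxy = LazyPickle(path)
        proxy.load()
        value = proxy.get()
        proxy.free()
        return value
```

The design keeps only a filename in memory between solves, which matters when a training set contains many large instances.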
--- a/miplearn/instance/picklegz.py
+++ b/miplearn/instance/picklegz.py
@@ -1,195 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import gc
 import gzip
 import os
 import pickle
 from typing import Optional, Any, List, cast, IO, TYPE_CHECKING, Dict, Callable
 import numpy as np
 from overrides import overrides
 from miplearn.features.sample import Sample
 from miplearn.instance.base import Instance
 from miplearn.types import ConstraintName
 from tqdm.auto import tqdm
 from p_tqdm import p_umap
 if TYPE_CHECKING:
    from miplearn.solvers.learning import InternalSolver
 class PickleGzInstance(Instance):
    """
    An instance backed by a gzipped pickle file.
    The instance is only loaded to memory after an operation is called (for example,
    `to_model`).
    Parameters
    ----------
    filename: str
        Path of the gzipped pickle file that should be loaded.
    """
    # noinspection PyMissingConstructor
    def __init__(self, filename: str) -> None:
        assert os.path.exists(filename), f"File not found: {filename}"
        self.instance: Optional[Instance] = None
        self.filename: str = filename
    @overrides
    def to_model(self) -> Any:
        assert self.instance is not None
        return self.instance.to_model()
    @overrides
    def get_instance_features(self) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_instance_features()
    @overrides
    def get_variable_features(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_variable_features(names)
    @overrides
    def get_variable_categories(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_variable_categories(names)
    @overrides
    def get_constraint_features(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_constraint_features(names)
    @overrides
    def get_constraint_categories(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.get_constraint_categories(names)
    @overrides
    def has_dynamic_lazy_constraints(self) -> bool:
        assert self.instance is not None
        return self.instance.has_dynamic_lazy_constraints()
    @overrides
    def are_constraints_lazy(self, names: np.ndarray) -> np.ndarray:
        assert self.instance is not None
        return self.instance.are_constraints_lazy(names)
    @overrides
    def find_violated_lazy_constraints(
        self,
        solver: "InternalSolver",
        model: Any,
    ) -> Dict[ConstraintName, Any]:
        assert self.instance is not None
        return self.instance.find_violated_lazy_constraints(solver, model)
    @overrides
    def enforce_lazy_constraint(
        self,
        solver: "InternalSolver",
        model: Any,
        violation_data: Any,
    ) -> None:
        assert self.instance is not None
        self.instance.enforce_lazy_constraint(solver, model, violation_data)
    @overrides
    def find_violated_user_cuts(self, model: Any) -> Dict[ConstraintName, Any]:
        assert self.instance is not None
        return self.instance.find_violated_user_cuts(model)
    @overrides
    def enforce_user_cut(
        self,
        solver: "InternalSolver",
        model: Any,
        violation_data: Any,
    ) -> None:
        assert self.instance is not None
        self.instance.enforce_user_cut(solver, model, violation_data)
    @overrides
    def load(self) -> None:
        if self.instance is None:
            obj = read_pickle_gz(self.filename)
            assert isinstance(obj, Instance)
            self.instance = obj
    @overrides
    def free(self) -> None:
        self.instance = None  # type: ignore
        gc.collect()
    @overrides
    def flush(self) -> None:
        write_pickle_gz(self.instance, self.filename)
    @overrides
    def get_samples(self) -> List[Sample]:
        assert self.instance is not None
        return self.instance.get_samples()
    @overrides
    def create_sample(self) -> Sample:
        assert self.instance is not None
        return self.instance.create_sample()
 def write_pickle_gz(obj: Any, filename: str) -> None:
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    with gzip.GzipFile(filename, "wb") as file:
        pickle.dump(obj, cast(IO[bytes], file))
 def read_pickle_gz(filename: str) -> Any:
    with gzip.GzipFile(filename, "rb") as file:
        return pickle.load(cast(IO[bytes], file))
 def write_pickle_gz_multiple(objs: List[Any], dirname: str) -> None:
    for (i, obj) in enumerate(objs):
        write_pickle_gz(obj, f"{dirname}/{i:05d}.pkl.gz")
 def save(
    objs: List[Any],
    dirname: str,
    progress: bool = False,
    n_jobs: int = 1,
 ) -> List[str]:
    """
    Saves the provided objects to gzipped pickled files. Files are named sequentially
    as `dirname/00000.pkl.gz`, `dirname/00001.pkl.gz`, etc.
    Parameters
    ----------
    objs: List[Any]
        List of objects to save
    dirname: str
        Output directory
    progress: bool
        If True, show progress bar
    n_jobs: int
        Number of CPUs passed to `p_umap`
    Returns
    -------
    List containing the relative paths of the saved files.
    """
    def _process(obj, filename):
        write_pickle_gz(obj, filename)
    filenames = [f"{dirname}/{i:05d}.pkl.gz" for i in range(len(objs))]
    p_umap(_process, objs, filenames, num_cpus=n_jobs)
    return filenames
 def load(filename: str, build_model: Callable) -> Any:
    with gzip.GzipFile(filename, "rb") as file:
        data = pickle.load(cast(IO[bytes], file))
        return build_model(data)
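Note on `write_pickle_gz`: `os.path.dirname` returns an empty string for a bare filename such as `data.pkl.gz`, and `os.makedirs("")` raises `FileNotFoundError`. The sketch below shows one way to guard against that while keeping the same gzip-plus-pickle round trip. The names `write_pickle_gz_safe` and `round_trip` are hypothetical and do not appear in the package.

```python
import gzip
import os
import pickle
import tempfile
from typing import Any


def write_pickle_gz_safe(obj: Any, filename: str) -> None:
    # Only create the parent directory when the path actually has one.
    dirname = os.path.dirname(filename)
    if dirname:
        os.makedirs(dirname, exist_ok=True)
    with gzip.open(filename, "wb") as file:
        pickle.dump(obj, file)


def read_pickle_gz(filename: str) -> Any:
    with gzip.open(filename, "rb") as file:
        return pickle.load(file)


def round_trip(obj: Any) -> Any:
    # Write into a nested directory that does not exist yet, then read back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nested", "00000.pkl.gz")
        write_pickle_gz_safe(obj, path)
        return read_pickle_gz(path)
```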
--- a/miplearn/io.py
+++ b/miplearn/io.py
@@ -0,0 +1,92 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2022, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 from gzip import GzipFile
 import os
 import pickle
 import sys
 from typing import IO, Any, Callable, List, cast, TextIO
 from .parallel import p_umap
 import shutil
 class _RedirectOutput:
    def __init__(self, streams: List[Any]) -> None:
        self.streams = streams
    def write(self, data: Any) -> None:
        for stream in self.streams:
            stream.write(data)
    def flush(self) -> None:
        for stream in self.streams:
            stream.flush()
    def __enter__(self) -> Any:
        self._original_stdout = sys.stdout
        self._original_stderr = sys.stderr
        sys.stdout = cast(TextIO, self)
        sys.stderr = cast(TextIO, self)
        return self
    def __exit__(
        self,
        _type: Any,
        _value: Any,
        _traceback: Any,
    ) -> None:
        sys.stdout = self._original_stdout
        sys.stderr = self._original_stderr
 def write_pkl_gz(
    objs: List[Any],
    dirname: str,
    prefix: str = "",
    n_jobs: int = 1,
    progress: bool = False,
 ) -> List[str]:
    filenames = [f"{dirname}/{prefix}{i:05d}.pkl.gz" for i in range(len(objs))]
    def _process(i: int) -> None:
        filename = filenames[i]
        obj = objs[i]
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with GzipFile(filename, "wb") as file:
            pickle.dump(obj, cast(IO[bytes], file))
    if n_jobs > 1:
        p_umap(
            _process,
            range(len(objs)),
            smoothing=0,
            num_cpus=n_jobs,
            maxtasksperchild=None,
            disable=not progress,
        )
    else:
        for i in range(len(objs)):
            _process(i)
    return filenames
 def gzip(filename: str) -> None:
    with open(filename, "rb") as input_file:
        with GzipFile(f"{filename}.gz", "wb") as output_file:
            shutil.copyfileobj(input_file, output_file)
    os.remove(filename)
 def read_pkl_gz(filename: str) -> Any:
    with GzipFile(filename, "rb") as file:
        return pickle.load(cast(IO[bytes], file))
 def _to_h5_filename(data_filename: str) -> str:
    output = f"{data_filename}.h5"
    output = output.replace(".pkl.gz.h5", ".h5")
    output = output.replace(".pkl.h5", ".h5")
    output = output.replace(".jld2.h5", ".h5")
    return output
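Note on `_RedirectOutput`: the class replaces `sys.stdout` and `sys.stderr` with an object that forwards every write to a list of streams, then restores the originals on exit. A reduced sketch of the same idea, handling only `sys.stdout`, might look as follows. The names `Tee` and `capture` are hypothetical.

```python
import io
import sys
from typing import Any, List


class Tee:
    """Hypothetical, stdout-only variant of the redirect pattern."""

    def __init__(self, streams: List[Any]) -> None:
        self.streams = streams

    def write(self, data: str) -> None:
        # Forward each chunk to every registered stream.
        for stream in self.streams:
            stream.write(data)

    def flush(self) -> None:
        for stream in self.streams:
            stream.flush()

    def __enter__(self) -> "Tee":
        self._original_stdout = sys.stdout
        sys.stdout = self  # type: ignore[assignment]
        return self

    def __exit__(self, *exc: Any) -> None:
        # Always restore the original stream, even after an exception.
        sys.stdout = self._original_stdout


def capture(message: str) -> str:
    buffer = io.StringIO()
    with Tee([buffer]):
        print(message)
    return buffer.getvalue()
```

Passing both a buffer and the original `sys.stdout` in the stream list would log the text while still showing it on the console.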
--- a/miplearn/log.py
+++ b/miplearn/log.py
@@ -1,74 +0,0 @@
 #  MIPLearn: Extensible Framework for Learning-Enhanced Mixed-Integer Optimization
 #  Copyright (C) 2020-2021, UChicago Argonne, LLC. All rights reserved.
 #  Released under the modified BSD license. See COPYING.md for more details.
 import logging
 import sys
 import time
 import traceback
 import warnings
 from typing import Dict, Any, Optional
 _formatwarning = warnings.formatwarning
 class TimeFormatter(logging.Formatter):
    def __init__(
        self,
        start_time: float,
        log_colors: Dict[str, str],
    ) -> None:
        super().__init__()
        self.start_time = start_time
        self.log_colors = log_colors
    def format(self, record: logging.LogRecord) -> str:
        if record.levelno >= logging.ERROR:
            color = self.log_colors["red"]
        elif record.levelno >= logging.WARNING:
            color = self.log_colors["yellow"]
        else:
            color = self.log_colors["green"]
        return "%s[%12.3f]%s %s" % (
            color,
            record.created - self.start_time,
            self.log_colors["reset"],
            record.getMessage(),
        )
 def formatwarning_tb(*args: Any, **kwargs: Any) -> str:
    s = _formatwarning(*args, **kwargs)
    tb = traceback.format_stack()
    s += "".join(tb[:-1])
    return s
 def setup_logger(
    start_time: Optional[float] = None,
    force_color: bool = False,
 ) -> None:
    if start_time is None:
        start_time = time.time()
    if sys.stdout.isatty() or force_color:
        log_colors = {
            "green": "\033[92m",
            "yellow": "\033[93m",
            "red": "\033[91m",
            "reset": "\033[0m",
        }
    else:
        log_colors = {
            "green": "",
            "yellow": "",
            "red": "",
            "reset": "",
        }
    handler = logging.StreamHandler()
    handler.setFormatter(TimeFormatter(start_time, log_colors))
    logging.getLogger().addHandler(handler)
    logging.getLogger("miplearn").setLevel(logging.INFO)
    logging.getLogger("gurobipy").setLevel(logging.ERROR)
    logging.getLogger("pyomo.core").setLevel(logging.ERROR)
    warnings.formatwarning = formatwarning_tb
    logging.captureWarnings(True)
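Note on `TimeFormatter`: each message is prefixed with the number of seconds elapsed since `start_time`, computed as `record.created - start_time`. The sketch below isolates that computation without the color handling. `ElapsedFormatter` and `format_example` are hypothetical names, and the timestamps are fixed by hand so the result is reproducible.

```python
import logging


class ElapsedFormatter(logging.Formatter):
    """Hypothetical, color-free variant of the elapsed-time formatter."""

    def __init__(self, start_time: float) -> None:
        super().__init__()
        self.start_time = start_time

    def format(self, record: logging.LogRecord) -> str:
        # Width 12 with 3 decimals matches the "%12.3f" layout used above.
        elapsed = record.created - self.start_time
        return "[%12.3f] %s" % (elapsed, record.getMessage())


def format_example() -> str:
    record = logging.LogRecord(
        name="demo",
        level=logging.INFO,
        pathname="demo.py",
        lineno=1,
        msg="solving %s",
        args=("instance",),
        exc_info=None,
    )
    # Override the creation time so the elapsed value is deterministic.
    record.created = 101.5
    return ElapsedFormatter(start_time=100.0).format(record)
```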