{"config":{"lang":["en"],"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"MIPLearn MIPLearn is an extensible framework for Learning-Enhanced Mixed-Integer Optimization , an approach targeted at discrete optimization problems that need to be repeatedly solved with only minor changes to input data. The package uses Machine Learning (ML) to automatically identify patterns in previously solved instances of the problem, or in the solution process itself, and produces hints that can guide a conventional MIP solver towards the optimal solution faster. For particular classes of problems, this approach has been shown to provide significant performance benefits (see benchmark results and references for more details). Features MIPLearn proposes a flexible problem specification format, which allows users to describe their particular optimization problems to a Learning-Enhanced MIP solver, both from the MIP perspective and from the ML perspective, without making any assumptions on the problem being modeled, the mathematical formulation of the problem, or ML encoding. While the format is very flexible, some constraints are enforced to ensure that it is usable by an actual solver. MIPLearn provides a reference implementation of a Learning-Enhanced Solver , which can use the above problem specification format to automatically predict, based on previously solved instances, a number of hints to accelerate MIP performance. Currently, the reference solver is able to predict: (i) partial solutions which are likely to work well as MIP starts; (ii) an initial set of lazy constraints to enforce; (iii) affine subspaces where the solution is likely to reside; (iv) variable branching priorities to accelerate the exploration of the branch-and-bound tree. The usage of the solver is very straightforward. The most suitable ML models are automatically selected, trained, cross-validated and applied to the problem with no user intervention. MIPLearn provides a set of benchmark problems and random instance generators, covering applications from different domains, which can be used to quickly evaluate new learning-enhanced MIP techniques in a measurable and reproducible way. MIPLearn is customizable and extensible . For MIP and ML researchers exploring new techniques to accelerate MIP performance based on historical data, each component of the reference solver can be individually replaced, extended or customized. Documentation Installation and typical usage Benchmark utilities Benchmark problems, challenges and results Customizing the solver License, authors, references and acknowledgements Souce Code https://github.com/iSoron/miplearn","title":"Home"},{"location":"#miplearn","text":"MIPLearn is an extensible framework for Learning-Enhanced Mixed-Integer Optimization , an approach targeted at discrete optimization problems that need to be repeatedly solved with only minor changes to input data. The package uses Machine Learning (ML) to automatically identify patterns in previously solved instances of the problem, or in the solution process itself, and produces hints that can guide a conventional MIP solver towards the optimal solution faster. 
# About

## Authors

* **Alinson S. Xavier,** Argonne National Laboratory <axavier@anl.gov>
* **Feng Qiu,** Argonne National Laboratory <fqiu@anl.gov>

## Acknowledgements

Based upon work supported by Laboratory Directed Research and Development (LDRD) funding from Argonne National Laboratory, provided by the Director, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357.

## References

* **Learning to Solve Large-Scale Security-Constrained Unit Commitment Problems.** Alinson S. Xavier, Feng Qiu, Shabbir Ahmed. INFORMS Journal on Computing (to appear). arXiv:1902.01696

## License

MIPLearn, an extensible framework for Learning-Enhanced Mixed-Integer Optimization.
Copyright (C) 2019-2020 Argonne National Laboratory. All rights reserved.
# Benchmarks Utilities

## Using BenchmarkRunner

MIPLearn provides the utility class `BenchmarkRunner`, which simplifies the task of comparing the performance of different solvers. The snippet below shows its basic usage:

```python
from miplearn import BenchmarkRunner, LearningSolver

# Create train and test instances
train_instances = [...]
test_instances = [...]

# Training phase...
training_solver = LearningSolver(...)
training_solver.parallel_solve(train_instances, n_jobs=10)
training_solver.save_state("data.bin")

# Test phase...
test_solvers = {
    "Baseline": LearningSolver(...),  # each solver may have different parameters
    "Strategy A": LearningSolver(...),
    "Strategy B": LearningSolver(...),
    "Strategy C": LearningSolver(...),
}
benchmark = BenchmarkRunner(test_solvers)
benchmark.load_state("data.bin")
benchmark.fit()
benchmark.parallel_solve(test_instances, n_jobs=2)
print(benchmark.raw_results())
```

The method `load_state` loads the saved training data into each one of the provided solvers, while `fit` trains their respective ML models. The method `parallel_solve` solves the test instances in parallel and collects solver statistics, such as running time and optimal value. Finally, `raw_results` produces a table of results (a pandas DataFrame) with the following columns:

* **Solver,** the name of the solver;
* **Instance,** the sequence number identifying the instance;
* **Wallclock Time,** the wallclock running time (in seconds) spent by the solver;
* **Lower Bound,** the best lower bound obtained by the solver;
* **Upper Bound,** the best upper bound obtained by the solver;
* **Gap,** the relative MIP integrality gap at the end of the optimization;
* **Nodes,** the number of explored branch-and-bound nodes.

In addition to the above, there is also a "Relative" version of most columns, in which the raw number is compared to the solver that provided the best performance. The Relative Wallclock Time, for example, indicates how many times slower a run was compared to the best time achieved by any solver on the same instance: if a run took 10 seconds, but the fastest solver solved the same instance in 5 seconds, the relative wallclock time would be 2.

## Saving and loading benchmark results

When iteratively exploring new formulations, encodings and solver parameters, it is often desirable to avoid repeating parts of the benchmark suite. For example, if the baseline solver has not been changed, there is no need to evaluate its performance again and again when making small changes to the remaining solvers. `BenchmarkRunner` provides the methods `save_results` and `load_results`, which can be used to avoid this repetition, as the next example shows:

```python
# Benchmark baseline solvers and save results to a file.
benchmark = BenchmarkRunner(baseline_solvers)
benchmark.load_state("training_data.bin")
benchmark.parallel_solve(test_instances)
benchmark.save_results("baseline_results.csv")

# Benchmark remaining solvers, loading baseline results from file.
benchmark = BenchmarkRunner(alternative_solvers)
benchmark.load_state("training_data.bin")
benchmark.load_results("baseline_results.csv")
benchmark.parallel_solve(test_instances)
```
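Since `raw_results` returns a plain pandas DataFrame, standard pandas operations can be used to summarize a benchmark run. The sketch below is illustrative only, not part of the `BenchmarkRunner` API: it continues from the snippet above and assumes the column names listed in this section.

```python
# Hypothetical post-processing of benchmark.raw_results(); assumes the
# columns "Solver", "Instance" and "Wallclock Time" described above.
results = benchmark.raw_results()

# Average wallclock time per solver across all test instances.
mean_time = results.groupby("Solver")["Wallclock Time"].mean()
print(mean_time.sort_values())

# Recompute the relative wallclock time by hand: divide each run's time
# by the best (smallest) time any solver achieved on the same instance.
best = results.groupby("Instance")["Wallclock Time"].transform("min")
results["Relative Wallclock Time (manual)"] = results["Wallclock Time"] / best
```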
# Customization

## Selecting the internal MIP solver

By default, `LearningSolver` uses Gurobi as its internal MIP solver. Alternative solvers can be specified through the `internal_solver_factory` constructor argument. This argument should provide a function (with no arguments) that constructs, configures and returns the desired solver. To select CPLEX, for example:

```python
from miplearn import LearningSolver
import pyomo.environ as pe

def cplex_factory():
    cplex = pe.SolverFactory("cplex")
    cplex.options["threads"] = 4
    return cplex

solver = LearningSolver(internal_solver_factory=cplex_factory)
```
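The same factory pattern should work for any solver Pyomo can construct. As a further illustration (not from the original docs), the sketch below configures CBC, the open-source COIN-OR solver; the `seconds` option (CBC's time limit) is an assumption about the CBC interface, not something MIPLearn prescribes. Note also that, per the Current Limitations section of the usage page, only `gurobi_persistent` is currently fully supported by all solver components, so alternative solvers may require disabling some components.

```python
from miplearn import LearningSolver
import pyomo.environ as pe

def cbc_factory():
    # Hypothetical variation of the CPLEX example above, using CBC.
    # The "seconds" option (CBC's time limit) is an assumption.
    cbc = pe.SolverFactory("cbc")
    cbc.options["seconds"] = 60
    return cbc

solver = LearningSolver(internal_solver_factory=cbc_factory)
```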
# Benchmark Problems, Challenges and Results

MIPLearn provides a selection of benchmark problems and random instance generators, covering applications from different fields, that can be used to evaluate new learning-enhanced MIP techniques in a measurable and reproducible way. On this page, we describe these problems and the included instance generators, and we present some benchmark results for `LearningSolver` with default parameters.

## Preliminaries

### Benchmark challenges

When evaluating the performance of a conventional MIP solver, *benchmark sets*, such as MIPLIB and TSPLIB, are typically used. The performance of newly proposed solvers or solution techniques is typically measured as the average (or total) running time the solver takes to solve the entire benchmark set. For Learning-Enhanced MIP solvers, it is also necessary to specify which instances the solver should be trained on (the *training instances*) before solving the actual set of instances we are interested in (the *test instances*). If the training instances are very similar to the test instances, we would expect a Learning-Enhanced Solver to present stronger performance benefits.

In MIPLearn, each optimization problem comes with a set of *benchmark challenges*, which specify how the training and test instances should be generated. The first challenges are typically easier, in the sense that training and test instances are very similar. Later challenges gradually make the sets more distinct, and therefore harder to learn from.

### Baseline results

To illustrate the performance of `LearningSolver`, and to set a baseline for newly proposed techniques, we present on this page, for each benchmark challenge, a small set of computational results measuring the solution speed and solution quality of the solver with default parameters. For more detailed computational studies, see the references. We compare three solvers:

* **baseline:** Gurobi 9.0 with default settings (a conventional state-of-the-art MIP solver);
* **ml-exact:** `LearningSolver` with default settings, using Gurobi 9.0 as internal MIP solver;
* **ml-heuristic:** same as above, but with `mode="heuristic"`.

All experiments presented here were performed on a Linux server (Ubuntu Linux 18.04 LTS) with two Intel Xeon Gold 6230 processors (40 cores, 80 threads) and 256 GB RAM (DDR4, 2933 MHz). All solvers were restricted to 4 threads, with no time limits, and 10 instances were solved simultaneously.

## Maximum Weight Stable Set Problem

### Problem definition

Given a simple undirected graph $G=(V,E)$ and weights $w \in \mathbb{R}^V$, the problem is to find a stable set $S \subseteq V$ that maximizes $\sum_{v \in S} w_v$. We recall that a subset $S \subseteq V$ is a stable set if no two vertices of $S$ are adjacent. This is one of Karp's 21 NP-complete problems.

### Random instance generator

The class `MaxWeightStableSetGenerator` can be used to generate random instances of this problem, with user-specified probability distributions. When the constructor parameter `fix_graph=True` is provided, one random Erdős-Rényi graph $G_{n,p}$ is generated when the constructor is called, where $n$ and $p$ are sampled from the user-provided probability distributions `n` and `p`. To generate each instance, the generator independently samples each $w_v$ from the user-provided probability distribution `w`. When `fix_graph=False`, a new random graph is generated for each instance, while the remaining parameters are sampled in the same way.

### Benchmark challenges

#### Challenge A

* Fixed random Erdős-Rényi graph $G_{n,p}$ with $n=200$ and $p=5\%$
* Random vertex weights $w_v \sim U(100, 150)$
* 500 training instances, 50 test instances

```python
MaxWeightStableSetGenerator(w=uniform(loc=100., scale=50.),
                            n=randint(low=200, high=201),
                            p=uniform(loc=0.05, scale=0.0),
                            fix_graph=True)
```

### Benchmark results

#### Challenge A
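For readers who want to experiment with the formulation directly, here is a minimal Pyomo sketch of the maximum-weight stable set model defined above, with graph and weights mirroring the Challenge A parameters. It is independent of MIPLearn's API and uses only `networkx` and `pyomo`; all names are illustrative.

```python
import random
import networkx as nx
import pyomo.environ as pe

# Graph and weights mirror Challenge A: G(200, 0.05), w ~ U(100, 150).
random.seed(42)
G = nx.erdos_renyi_graph(200, 0.05, seed=42)
w = {v: random.uniform(100, 150) for v in G.nodes}

model = pe.ConcreteModel()
model.x = pe.Var(list(G.nodes), domain=pe.Binary)

# Maximize the total weight of the selected stable set S.
model.obj = pe.Objective(expr=sum(w[v] * model.x[v] for v in G.nodes),
                         sense=pe.maximize)

# Stability: no two adjacent vertices may both be selected.
model.edge_constraints = pe.ConstraintList()
for u, v in G.edges:
    model.edge_constraints.add(model.x[u] + model.x[v] <= 1)
```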
## Multidimensional 0-1 Knapsack Problem

### Problem definition

Given a set of $n$ items and $m$ types of resources (also called *knapsacks*), the problem is to find a subset of items that maximizes profit without consuming more resources than are available. More precisely, the problem is:

\begin{align*}
\text{maximize} \;\; & \sum_{j=1}^n p_j x_j \\
\text{subject to} \;\; & \sum_{j=1}^n w_{ij} x_j \leq b_i & \forall i=1,\ldots,m \\
& x_j \in \{0,1\} & \forall j=1,\ldots,n
\end{align*}

### Random instance generator

The class `MultiKnapsackGenerator` can be used to generate random instances of this problem. The number of items $n$ and knapsacks $m$ are sampled from the user-provided probability distributions `n` and `m`. The weights $w_{ij}$ are sampled independently from the provided distribution `w`. The capacity of knapsack $i$ is set to

$$b_i = \alpha_i \sum_{j=1}^n w_{ij},$$

where $\alpha_i$, the tightness ratio, is sampled from the provided probability distribution `alpha`. To make the instances more challenging, the costs of the items are linearly correlated with their average weights. More specifically, the price of each item $j$ is set to

$$p_j = \sum_{i=1}^m \frac{w_{ij}}{m} + K u_j,$$

where $K$, the correlation coefficient, and $u_j$, the correlation multiplier, are sampled from the provided probability distributions `K` and `u`.

If `fix_w=True` is provided, then the $w_{ij}$ are kept the same in all generated instances; this also implies that $n$ and $m$ are kept fixed. Although the prices and capacities are derived from $w_{ij}$, as long as `u` and `K` are not constants, the generated instances will still not be completely identical.

If a probability distribution `w_jitter` is provided, then item weights are set to $w_{ij} + \gamma_{ij}$, where $\gamma_{ij}$ is sampled from `w_jitter`. When combined with `fix_w=True`, this argument may be used to generate instances where the weight of each item is roughly the same, but not exactly identical, across all instances. The prices of the items and the capacities of the knapsacks are calculated as above, but using these perturbed weights instead.

By default, all generated prices, weights and capacities are rounded to the nearest integer. If `round=False` is provided, this rounding is disabled.
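To make the capacity and pricing rules concrete, here is a small NumPy sketch of the sampling scheme described above. It illustrates the equations only; it is not the actual `MultiKnapsackGenerator` implementation, and the specific distributions and sizes are arbitrary choices.

```python
import numpy as np

# Illustrative sketch of the sampling scheme above (not the actual
# MultiKnapsackGenerator implementation).
rng = np.random.default_rng(42)
n, m = 50, 5       # items, knapsacks
K = 500.0          # correlation coefficient (fixed here; the generator samples it)

w = rng.uniform(0, 1000, size=(m, n))     # weights w_ij
alpha = rng.uniform(0.25, 0.75, size=m)   # tightness ratios alpha_i
u = rng.uniform(0, 1, size=n)             # correlation multipliers u_j

b = alpha * w.sum(axis=1)                 # b_i = alpha_i * sum_j w_ij
p = w.mean(axis=0) + K * u                # p_j = sum_i w_ij / m + K * u_j

# By default, the generator rounds everything to the nearest integer.
w, b, p = w.round(), b.round(), p.round()
```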
### References

* Fréville, Arnaud, and Gérard Plateau. *An efficient preprocessing procedure for the multidimensional 0–1 knapsack problem.* Discrete Applied Mathematics 49.1-3 (1994): 189-212.
* Fréville, Arnaud. *The multidimensional 0–1 knapsack problem: An overview.* European Journal of Operational Research 155.1 (2004): 1-21.
# Usage

## Installation

The package is currently available for Python and Pyomo. It can be installed as follows:

```bash
pip install git+ssh://git@github.com/iSoron/miplearn.git
```

A Julia + JuMP version of the package is planned.

## Using LearningSolver

The main class provided by this package is `LearningSolver`, a reference learning-enhanced MIP solver which automatically extracts information from previous runs to accelerate the solution of new instances. Assuming we already have a list of instances to solve, `LearningSolver` can be used as follows:

```python
from miplearn import LearningSolver

all_instances = ...  # user-provided list of instances to solve
solver = LearningSolver()
for instance in all_instances:
    solver.solve(instance)
    solver.fit()
```

During the first call to `solver.solve(instance)`, the solver will process the instance from scratch, since no historical information is available, but it will already start gathering information. By calling `solver.fit()`, we instruct the solver to train all the internal Machine Learning models based on the information gathered so far. As this operation can be expensive, it may be performed after a larger batch of instances has been solved, instead of after every solve. After the first call to `solver.fit()`, subsequent calls to `solver.solve(instance)` will automatically use the trained Machine Learning models to accelerate the solution process.
## Describing problem instances

Instances to be solved by `LearningSolver` must derive from the abstract class `miplearn.Instance`. The following three abstract methods must be implemented:

* `instance.to_model()`, which returns a concrete Pyomo model corresponding to the instance;
* `instance.get_instance_features()`, which returns a 1-dimensional NumPy array of (numerical) features describing the entire instance;
* `instance.get_variable_features(var, index)`, which returns a 1-dimensional array of (numerical) features describing a particular decision variable.

The first method is used by `LearningSolver` to construct a concrete Pyomo model, which will be provided to the internal MIP solver. The user should keep a reference to this Pyomo model, in order to retrieve, for example, the optimal variable values.

The second and third methods provide an encoding of the instance, which can be used by the ML models to make predictions. In the knapsack problem, for example, an implementation may decide to provide as instance features the average weights, average prices, number of items and the size of the knapsack. The weight and the price of each individual item could be provided as variable features. See `miplearn/problems/knapsack.py` for a concrete example, and the sketch at the end of this section.

An optional method which can be implemented is `instance.get_variable_category(var, index)`, which returns a category (a string, an integer or any other hashable type) for each decision variable. If two variables have the same category, `LearningSolver` will use the same internal ML model to predict the values of both variables. By default, all variables belong to the `"default"` category, and therefore only one ML model is used for all variables. If the returned category is `None`, ML predictors will ignore the variable.

It is not necessary to have a one-to-one correspondence between features and problem instances. One important (and deliberate) limitation of MIPLearn, however, is that `get_instance_features()` must always return arrays of the same length for all relevant instances of the problem. Similarly, `get_variable_features(var, index)` must also always return arrays of the same length for all variables in each category. It is up to the user to decide how to encode variable-length characteristics of the problem into fixed-length vectors. In graph problems, for example, graph embeddings can be used to reduce the (variable-length) lists of nodes and edges into a fixed-length structure that still preserves some properties of the graph. Different instance encodings may have a significant impact on performance.
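To make the interface above concrete, here is a minimal, hypothetical `Instance` subclass for a single 0-1 knapsack. The feature choices follow the suggestion in the text; the class name and everything else are illustrative rather than taken from `miplearn/problems/knapsack.py`, and we assume `Instance` is importable from the package root, as `LearningSolver` is.

```python
import numpy as np
import pyomo.environ as pe
from miplearn import Instance

class MyKnapsackInstance(Instance):
    """Hypothetical Instance subclass for a single 0-1 knapsack."""

    def __init__(self, weights, prices, capacity):
        self.weights = np.array(weights)
        self.prices = np.array(prices)
        self.capacity = capacity

    def to_model(self):
        # Concrete Pyomo model handed to the internal MIP solver.
        items = range(len(self.weights))
        model = pe.ConcreteModel()
        model.x = pe.Var(items, domain=pe.Binary)
        model.obj = pe.Objective(
            expr=sum(self.prices[i] * model.x[i] for i in items),
            sense=pe.maximize)
        model.budget = pe.Constraint(
            expr=sum(self.weights[i] * model.x[i] for i in items)
                 <= self.capacity)
        return model

    def get_instance_features(self):
        # Fixed-length encoding of the whole instance, as suggested above.
        return np.array([
            self.weights.mean(),
            self.prices.mean(),
            len(self.weights),
            self.capacity,
        ])

    def get_variable_features(self, var, index):
        # Fixed-length encoding of one decision variable.
        return np.array([self.weights[index], self.prices[index]])
```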
## Obtaining heuristic solutions

By default, `LearningSolver` uses Machine Learning to accelerate the MIP solution process, while maintaining all optimality guarantees provided by the MIP solver. In the default mode of operation, for example, predicted optimal solutions are used only as MIP starts.

For more significant performance benefits, `LearningSolver` can also be configured to place additional trust in the Machine Learning predictors, by using the `mode="heuristic"` constructor argument. When operating in this mode, if an ML model is statistically shown (through stratified k-fold cross validation) to have exceptionally high accuracy, the solver may decide to restrict the search space based on its predictions. The parts of the solution which the ML models cannot predict accurately will still be explored using traditional (branch-and-bound) methods. For particular applications, this mode has been shown to quickly produce optimal or near-optimal solutions (see references and benchmark results).

!!! danger
    The heuristic mode provides no optimality guarantees, and therefore should only be used if the solver is first trained on a large and representative set of training instances. Training on a small or non-representative set of instances may produce low-quality solutions, or make the solver incorrectly classify new instances as infeasible.

## Saving and loading solver state

After solving a large number of training instances, it may be desirable to save the current state of `LearningSolver` to disk, so that the solver can still use the acquired knowledge after the application restarts. This can be accomplished by using the methods `solver.save_state(filename)` and `solver.load_state(filename)`, as the following example illustrates:

```python
from miplearn import LearningSolver

solver = LearningSolver()
for instance in some_instances:
    solver.solve(instance)
solver.fit()
solver.save_state("/tmp/state.bin")

# Application restarts...

solver = LearningSolver()
solver.load_state("/tmp/state.bin")
for instance in more_instances:
    solver.solve(instance)
```

In addition to storing the training data, `save_state` also stores all trained ML models. Therefore, if the models were trained before saving the state to disk, it is not necessary to train them again after loading.

## Solving training instances in parallel

In many situations, training instances can be solved in parallel to accelerate the training process. `LearningSolver` provides the method `parallel_solve(instances)` to easily achieve this:

```python
from miplearn import LearningSolver

# Training phase...
solver = LearningSolver(...)  # training solver parameters
solver.parallel_solve(training_instances, n_jobs=4)
solver.fit()
solver.save_state("/tmp/data.bin")

# Test phase...
solver = LearningSolver(...)  # test solver parameters
solver.load_state("/tmp/data.bin")
solver.solve(test_instance)
```

After all training instances have been solved in parallel, the ML models can be trained and saved to disk as usual, using `fit` and `save_state`, as explained in the previous subsections.

## Current Limitations

* Only binary and continuous decision variables are currently supported.
* Solver callbacks (lazy constraints, cutting planes) are not currently supported.
* Only `gurobi_persistent` is currently fully supported by all solver components. Other solvers may work if some components are disabled.
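Putting the pieces together, the following sketch shows one plausible end-to-end workflow combining the features described on this page: train on historical instances in parallel, persist the solver state, then solve new instances in heuristic mode. It only recombines calls documented above; the file path and instance lists are placeholders.

```python
from miplearn import LearningSolver

# Training phase: solve historical instances with full optimality
# guarantees, then train and persist the ML models.
training_solver = LearningSolver()
training_solver.parallel_solve(training_instances, n_jobs=4)
training_solver.fit()
training_solver.save_state("/tmp/knowledge.bin")

# Test phase: reuse the trained models in heuristic mode. As warned
# above, this drops optimality guarantees and assumes the training
# set is large and representative.
heuristic_solver = LearningSolver(mode="heuristic")
heuristic_solver.load_state("/tmp/knowledge.bin")
for instance in test_instances:
    heuristic_solver.solve(instance)
```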