X

Download Integration of Artificial Intelligence PowerPoint Presentation

SlidesFinder-Advertising-Design.jpg

Login   OR  Register
X


Iframe embed code :



Presentation url :

Home / Computers & Web / Computers & Web Presentations / Integration of Artificial Intelligence PowerPoint Presentation

Integration of Artificial Intelligence PowerPoint Presentation

Ppt Presentation Embed Code   Zoom Ppt Presentation

PowerPoint is the world's most popular presentation software which can let you create professional Integration of Artificial Intelligence powerpoint presentation easily and in no time. This helps you give your presentation on Integration of Artificial Intelligence in a conference, a school lecture, a business proposal, in a webinar and business and professional representations.

The uploader spent his/her valuable time to create this Integration of Artificial Intelligence powerpoint presentation slides, to share his/her useful content with the world. This ppt presentation uploaded by onlinesearch in Computers & Web ppt presentation category is available for free download,and can be used according to your industries like finance, marketing, education, health and many more.

About This Presentation

Integration of Artificial Intelligence Presentation Transcript

Slide 1 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com
Slide 2 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS
Slide 3 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS
Slide 4 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions
Slide 5 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview
Slide 6 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911
Slide 7 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions
Slide 8 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex
Slide 9 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview
Slide 10 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911
Slide 11 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn
Slide 12 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1)
Slide 13 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0
Slide 14 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense
Slide 15 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi
Slide 16 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance
Slide 17 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40];
Slide 18 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40];
Slide 19 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] };
Slide 20 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc
Slide 21 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal
Slide 22 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0
Slide 23 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality
Slide 24 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley
Slide 25 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter
Slide 26 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope
Slide 27 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex
Slide 28 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance
Slide 29 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance
Slide 30 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent.
Slide 31 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network
Slide 32 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming
Slide 33 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming
Slide 34 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete
Slide 35 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search.
Slide 36 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region
Slide 37 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex
Slide 38 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for?
Slide 39 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets
Slide 40 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink
Slide 41 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost
Slide 42 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k
Slide 43 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat
Slide 44 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k
Slide 45 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z
Slide 46 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big).
Slide 47 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node.
Slide 48 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*)
Slide 49 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer
Slide 50 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function
Slide 51 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ...
Slide 52 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i
Slide 53 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1
Slide 54 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j
Slide 55 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi
Slide 56 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily
Slide 57 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours.
Slide 58 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week
Slide 59 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168.
Slide 60 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed.
Slide 61 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break.
Slide 62 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w
Slide 63 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 };
Slide 64 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers.
Slide 65 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc.
Slide 66 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n;
Slide 67 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); }
Slide 68 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); }
Slide 69 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin]
Slide 70 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming
Slide 71 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing
Slide 72 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied.
Slide 73 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility
Slide 74 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints.
Slide 75 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al)
Slide 76 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy.
Slide 77 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization
Slide 78 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers
Slide 79 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP
Slide 80 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other
Slide 81 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ?
Slide 82 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ?
Slide 83 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint
Slide 84 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i
Slide 85 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i]
Slide 86 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis)
Slide 87 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87
Slide 88 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X
Slide 89 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X
Slide 90 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X
Slide 91 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity
Slide 92 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling
Slide 93 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices.
Slide 94 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP
Slide 95 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this
Slide 96 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner
Slide 97 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes
Slide 98 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem
Slide 99 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly
Slide 100 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes
Slide 101 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation
Slide 102 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation
Slide 103 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy
Slide 104 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP
Slide 105 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP
Slide 106 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us
Slide 107 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo
Slide 108 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action
Slide 109 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs
Slide 110 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling
Slide 111 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks
Slide 112 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc
Slide 113 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo
Slide 114 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality
Slide 115 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome
Slide 116 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING
Slide 117 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set.
Slide 118 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set
Slide 119 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i
Slide 120 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable
Slide 121 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region
Slide 122 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs
Slide 123 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley
Slide 124 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP
Slide 125 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution
Slide 126 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1
Slide 127 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region
Slide 128 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9
Slide 129 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T
Slide 130 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t )
Slide 131 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*)
Slide 132 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack
Slide 133 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1
Slide 134 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide
Slide 135 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well );
Slide 136 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); }
Slide 137 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this);
Slide 138 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0;
Slide 139 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 140 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 141 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 142 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 143 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 144 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 145 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 146 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 147 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 148 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 149 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 150 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 151 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 152 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 153 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 154 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 155 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 156 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 157 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 158 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 159 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 160 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 161 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 162 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 163 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 164 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 165 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 166 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 167 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 168 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 169 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 170 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 171 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 172 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 173 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 174 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 175 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 176 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 177 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 178 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 179 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 180 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 181 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i
Slide 182 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000
Slide 183 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,...
Slide 184 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy:
Slide 185 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay
Slide 186 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay
Slide 187 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy)
Slide 188 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance
Slide 189 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0.
Slide 190 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing
Slide 191 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale)
Slide 192 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain
Slide 193 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.
Slide 194 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194
Slide 195 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195
Slide 196 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution
Slide 197 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior.
Slide 198 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs
Slide 199 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function
Slide 200 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.)
Slide 201 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!!
Slide 202 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup
Slide 203 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato )
Slide 204 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly.
Slide 205 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails
Slide 206 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log)
Slide 207 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000
Slide 208 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3
Slide 209 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy.
Slide 210 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis
Slide 211 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3
Slide 212 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup
Slide 213 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy.
Slide 214 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis
Slide 215 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design
Slide 216 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths.
Slide 217 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search
Slide 218 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values
Slide 219 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value)
Slide 220 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
Slide 221 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30%
Slide 222 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall.
Slide 223 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first
Slide 224 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios
Slide 225 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set
Slide 226 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios
Slide 227 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors)
Slide 228 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena.
Slide 229 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates
Slide 230 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates
Slide 231 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS
Slide 232 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis
Slide 233 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions ,
Slide 234 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods
Slide 235 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers
Slide 236 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998)
Slide 237 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos
Slide 238 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix
Slide 239 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio.
Slide 240 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk.
Slide 241 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk.
Slide 242 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist;
Slide 243 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist;
Slide 244 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and
Slide 245 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and 245 Appendix Portfolio of Algorithms Given N processors, such that and and the term in the summation is 0 when
Slide 246 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and 245 Appendix Portfolio of Algorithms Given N processors, such that and and the term in the summation is 0 when 246 Preliminary Research on Structure of Search Spaces
Slide 247 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and 245 Appendix Portfolio of Algorithms Given N processors, such that and and the term in the summation is 0 when 246 Preliminary Research on Structure of Search Spaces 247 Fringe of Search Tree
Slide 248 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and 245 Appendix Portfolio of Algorithms Given N processors, such that and and the term in the summation is 0 when 246 Preliminary Research on Structure of Search Spaces 247 Fringe of Search Tree 248 Fractal Dimension
Slide 249 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and 245 Appendix Portfolio of Algorithms Given N processors, such that and and the term in the summation is 0 when 246 Preliminary Research on Structure of Search Spaces 247 Fringe of Search Tree 248 Fractal Dimension 249 Fractal Dimension When plotting the length of a curve as a function of the measuring tool on a log-log plot, one obtains a linear relationship: L - the measured length; s - length of the yardstick; c and d are constants; Mandelbrot introduced the fractal dimension D = d +1; A straight line has D = 1.0; The coast of Britain has fractal dimension 1.22; The higher D the more fractal the curve is.
Slide 250 - 1 Integration of Artificial Intelligence and Operations Research Techniques for Combinatorial Problems Carla P. Gomes Cornell University gomes@cs.cornell.edu Ken McAloon and Carol Tretkoff ILOG {mcaloon,tretkoff}@ilog.com 2 AI, OR, and CS AI OR CS 3 Integration of Artificial Intelligence & Operations Research Techniques AI Representations Constraint Languages Logic Formalisms Object-Oriented Prog. Bayesian Nets Rule Based Systems •  •  • Tools Constraint Propagation Systematic Search Stochastic Search •  •  • Pros / Cons Rich Representations Computational Complexity OR Representations Mathematical Modeling Languages Linear & Non-linear (In)Equalities •  •  • Tools Linear Programming Mixed-Integer Prog. Non-linear Models •  •  • Pros / Cons More Tractable (LP) Primarily Complete Info Limited Representations Combinatorial Problems Planning Scheduling THE CHALLENGE AI OR UNIFY APPROACHES TO: SCALE UP SOLUTIONS HANDLE UNCERTAINTY ANALYZE COMPLEXITY (phase transition) EXPLOIT PROBLEM STRUCTURE INCREASE ROBUSTNESS 4 Outline I. Short Overview of OR II. Disjunctive Programming and Hybrid Solvers III. Exploiting Randomization to Solve Hard Combinatorial Problems IV. Conclusions 5 I. Short OR Overview 6 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 7 Outline 1. Short Overview of OR 2. Constraint Programming 3. Cooperating Solvers 4. Disjunctive Programming 5. Exploiting Randomization to Solve Hard Combinatorial Problems 6. Conclusions 8 Optimization Technology Evolution Dispatch Rules 1960 1970 1980 1990 SA, GA, Tabu CPM PERT Constraint-based Scheduling 1998 1947 Primal Simplex LP Parallel LP/MIP Concurrent Scheduling Interior Point Constraint Propagation Large IPs MIP Shifting Bottleneck First CP Systems Cooperating Solvers (LP/CP) Global constraints Barrier LP Barrier Crossover Dual Simplex Implementation Dual Simplex 9 1. Short OR Overview 10 Outline for Linear Programming and Integer Programming Standard Form of LP and a Simple Example Geometric Interpretation of LP Complexity issues MIP Example: Fast Food Example: Capacitated Warehouse Example: 911 11 An LP Story A factory can produce n products from m parts For product j it needs aij units of part i There are bi units of part i available Each unit of product j sold earns cj Amount of each product to make is unknown xj  0 Each part i determines a constraint ai1 x1 + … + ain xn  bi Obvious solution: do nothing Better: maximize c1 x1 + … + cn xn 12 Standard Forms of LP A linear program (LP) in standard form (Dantzig 1947) max cTx subject to Ax  b x  0 Input data: c (n x 1), A (m x n), b (m x 1). Variables: x (n x 1) 13 Standard Forms of LP // The objective function max c1 x1 + … + cn xn // The constraints subject to a11 x1 + … + a1n xn  b1 ... am1 x1 + … + amn xn  bm x1  0 , … , xn  0 14 Standard Forms of LP In OR emphasis is on optimality Solution means optimal solution Feasible solution means solution in the ordinary sense 15 Standard Forms of LP Interpretation of standard form: xj = amount of product j to make cj = revenue per unit product j bi = available amount of component i aij = units of i used per unit of j produced The constraints “say”:  aijxj =  units of i used by j = units of i used  bi 16 What are models? A model is a data-independent abstraction of a problem A model lets you write down the mathematical representation of a model independently of the data Project Model Data One Problem Instance 17 Products Could be Jewelry Products: Rings and Earrings Components: Gold and Diamonds One ring requires 3 units of Gold, and 1 Diamond One set of earrings requires 2 units of Gold, and 2 Diamonds Total Gold and Diamonds are limited Profit is different for Rings than for Earrings Products = { rings, earrings }; Components = { Gold, Diamonds }; demand = [ [3, 1], [2, 2] ]; stock = [150, 180]; profit = [60, 40]; 18 Products: Ammonium Gas = NH3 Ammonium Chloride = NH4Cl Components: Nitrogen, Hydrogen, Chlorine One unit of Gas requires 1 unit of Nitrogen, 3 units Hydrogen One unit of Chloride requires 1 unit of Nitrogen, 4 units Hydrogen, and 1 unit of Chlorine Total Nitrogen, Hydrogen, Chlorine is limited Profit is different for Gas than Chloride Products Could be Chemicals Products = { gas, chloride }; Components = { nitrogen, hydrogen, chlorine }; demand = [ [1, 3, 0], [1, 4, 1] ]; stock = [50, 180, 40]; profit = [30, 40]; 19 The Problems Have One Model enum Products ...; enum Components ...; float+ demand[Products, Components] = ...; float+ profit[Products] = ...; float+ stock[Components] = ...; var float+ production[Products]; maximize sum (p in Products) profit[p] * production[p] subject to { forall (c in Components) sum (p in Products) demand[p, c] * production[p] <= stock[c] }; 20 OR Modeling Systems OPL AMPL 2LP AIMMS GAMS MPL ILOG Planner etc 21 The Dual The dual linear program (von Neumann 1947); min yTb subject to yTA  c y  0 Variables y (m x 1) Awesome Symmetry - The dual of the dual is the primal 22 Rows and Columns Exchanged min b1 y1 + … + bm yn subject to a11 y1 + … + am1 ym  c1 ... a1n y1 + … + amn ym  cn y1  0 , … , ym  0 23 Duality Theorem Theorem: min yTb = max cTx Consequence: This turns optimality problem into a feasibility problem in x and y Ax  b x  0 yTA  cT y  0 yTb = cTx Consequence: Enumeration not needed to verify optimality 24 Duality Theorem Sensitivity Analysis Consequence: The solution values y* for the y variables yield the Lagrange multipliers of the primal constraints which measure the rate of change of the objective function with respect to the right hand side bounds b yi * =  Z /  bi where Z is the optimum Reference: McAloon and Tretkoff [1996] Wiley Duality Two different views of the same phenomenon Point vs Set Arc vs Node Momentum vs Position Vector vs Hyperplane Landlord vs Renter 26 Simplex and Barrier The simplex algorithm turns the feasibility problem into a iterative repair process with a powerful evaluation function The barrier method transforms the LP into a system of differential equations that describe a vector field of flow on the polytope 27 Geometric Interpretation of LP X Y Max: X subject to: -X + Y <= 4 X + 4*y <= 36 2*X + y <= 23 X + Y >= 4 Y >= X + 10 (0,4) (4,0) (8,0) (10,3) (4,8) Barrier Simplex 28 Complexity of Linear Programming Simplex Method Worst-case --- exponential (Klee and Minty 72) Practice --- good performance Ellipsoid Method Khachian’s Ellipsoid Method Worst-case --- polynomial Practice --- poor performance 29 Complexity of Linear Programming Interior Point Methods or Barrier Methods “Karmarkar’s” (and variants) Method Worst-case --- polynomial Practice --- good performance 30 Complexity of Linear Programming Despite its worst case exponential time complexity, the simplex method is usually the method of choice since it provides tools for sensitivity analysis and its performance is very competitive in practice. Which method performs best is problem dependent. 31 Success Stories Industrial Planning Given current resources, decide what to produce in what quantity Supply Chain Management Multiperiod planning models that link flow from one period to the next Network Flow How best to route goods across a network 32 Assumptions of Linear Programming Linearity when violated: ( xy = 50) Nonlinear programming Continuity when violated: (x integral) (Mixed) Integer programming 33 Assumptions of Linear Programming - continued No Disjunctive Constraints when violated: (x  100 or x  0) Disjunctive programming Additional 0-1 variables and Big M constraints Certainty when violated: (cost c is a random variable) Stochastic programming 34 Search and MIP In order to deal with variables that must have integer values in the solution, a search must be performed. Mixed Integer Programming problems are combinatorial optimization problems and are NP hard feasibility is NP-Complete verifying optimality is co-NP-Complete 35 MIP and Combinatorial Optimization These problems have been attacked by both the AI and OR communities. In AI, these problems are attacked as CSPs or as Planning Problems. In OR, they are done as MIPs and use linear relaxation to help guide the search. The overriding idea in each case is to limit search. 36 Integer Program: All Integer Points in Region 37 Cut to Create Integer Vertex Integer Vertex 38 Example - Fast Food Question: Is it possible for a male college student to eat at the local fast food outlet and still meet the requirements of a balanced diet? If so, what is the least he can do it for? 39 Nutritional Requirements At least 100% of vitamins A, C, B1, B2, niacin, calcium and iron At least 55 grams of protein At most 3000 milligrams of sodium At most 30% of the calories can come from fat Nutritional information is available from fast food outlets 40 College Student’s Requirements At least 2000 calories a day No more than 3 servings of any one food Milk only with cereal and not as a stand-alone drink 41 Fast Food - MIP Model We will have variables Servk to represent the number of servings of item k in the plan. The variable Servk will have to take an integer value for the solution to be valid. The objective function: Z for cost 42 Fast Food - MIP Model Let foodk,j represent the percent of RDA of nutrient j in a serving of item k The for each nutrient j, we have a constraint  foodk,j Servk  100 k 43 Fast Food - MIP Model Let sodiumk represent the amount of salt in a serving of item k For salt we have the constraint  sodiumk Servk  3000 k Similarly for fat 44 Fast Food - MIP Model Let costk represent the cost of a serving of item k For the objective function we have the defining constraint  costk Servk = Z k 45 Fast Food - Solution With a MIP solver and a way to input these constraints we ask for a solution that makes the variables Servk integral and which minimizes Z 46 MIP Solution Technique What the MIP solver does is to carry out a branch and bound search guided by the linear relaxation the solution to the problem with the integrality requirements relaxed Initialize the global variable best_so_far to 1000 (or something else very big). 47 At a Node Compute a solution to the linear relaxation which minimizes Z yielding z*. Prune this node if z*  best_so_far , If all values of Servk are integral, this is a solution. Set best_so_far = z*. Save this node. 48 Branching at a node Choose a variable Servk whose value s* is not integral. Typical heuristic: most non-integral variable Create two child nodes, add Servk  floor(s*) add Servk  ceil(s*) 49 Good News The linear relaxation can prune nodes before all variables Servk are forced to be integral. Surprisingly often a node “high in the tree” will turn up with all relevant variables integer. Here’s why A solution to the LP is at a vertex A vertex is defined as the simultaneous solution of the equality form of n linearly independent constraints Many of these constraints are integer bounding constraints yielding X = integer 50 Arboreally Speaking Breadth first search is often preferred - it visits the “smallest” number of nodes needed to find and verify the optimal solution - analogous to A* If the linear relaxation is tight | z*linear - z*integral | is relatively small then z*linear is an excellent evaluation function 51 Answer - Fast Food Total cost is 8.71 Buy 3 burgers Buy 2 fries Buy 3 honeys Buy 1 yogurt ... 52 Example - Fixed Cost Warehouses must be rented in order to supply stores and we must decide which to use For each store j we know its monthly demand dj For each warehouse i we know its capacity ki For each warehouse i we know the fixed cost to run it each month fci For each pair i, j we know the monthly cost cij of supplying j from i 53 Example - Fixed Cost Xij is the fraction of store j’s demand met by i Xij  1 Yi is a “fuzzy” boolean it will be 1 if the warehouse is rented 0 if it is not rented Yi  1 54 Example - Fixed Cost Each store must be supplied  X ij = 1 i Warehouse capacity can not be exceeded  dj Xij  ki j Tighter  dj Xij  ki Yi j 55 Example - Fixed Cost Objective function  fci Yi +   cij Xij This yields a MIP with 0-1 variables Yi 56 Branch and Cut: An Enhanced Solution Method Cuts - redundant constraints for the MIP model but not redundant for the linear relaxation Xij  Yi Add at a node if violated by solution to linear relaxation Powerful method - will solve the Imperial College OR lib CW problems very easily 57 Example - Call 911 PCTs answer the phone 24 hours a day, 7 days a week. It is known how many PCTs should be on duty during each of the 168 hours during the week in order to assure the necessary response rate. Workers can arrive at any hour and they work for 8 hours except for a one hour break after 4 hours. 58 Example - Call 911 Each PCT has a work week of 5 days followed by 2 days off. Want to meet the demand with minimal or near-minimal number of PCTs. So need to determine how many PCTs start their work week at each hour h of the week 59 Modeling 911 A continuous variable Pcth will represent the number of workers who start their work week at hour h, 0  h < 168. 60 Modeling 911 A continuous variable Z will represent the objective function  Pcth = Z h There will be a constraint for each hour h to assert that there are enough workers on duty at that time. The rhs of this constraint is bh = the number of workers needed. 61 Modeling 911 For this constraint we need to represent the number of workers who are on duty at time h Certainly, those who start the week at time h are here, as are those who started the week at time h - 1 And so on back to time h - 7 with the exception of those who started at time h - 4 and who are now on break. 62 Modeling 911 This also applies to the previous 4 days. When the smoke clears, we sum over the workers w who are working at time h  Pctw  bh w 63 Call 911 solved with progressive roundoff int b[168] = { // New York City 911 30,24,18,15,14,14,15,25,34,36,38,40, 41,43,46,57,57,59,61,59,55,50,45,38, 32,25,20,17,15,13,17,25,32,35,38,40, 42,43,47,58,57,57,59,57,55,52,47,41, 33,25,20,17,15,13,15,25,32,33,37,39, 42,43,47,57,56,57,57,56,53,50,47,41, 34,27,22,19,16,15,16,25,31,35,37,40, 44,45,48,57,57,56,58,56,53,53,46,41, 34,28,23,19,16,15,17,25,33,37,39,42, 45,47,51,59,58,60,61,61,57,56,57,55, 48,41,35,30,26,20,18,22,26,32,42,46, 49,53,54,56,56,56,59,59,57,57,56,56, 52,46,41,34,29,23,18,19,25,31,36,41, 46,50,52,53,52,53,54,53,50,49,45,40 }; 64 Modeling 911 Subject to these constraint we want to find a solution which makes the Pcth integer and which makes Z small. The naïve approach is to compute the minimal linear solution and to round up all the values of Pcth to the nearest integer. The linear relaxation yields Z = 204.67 “fuzzy workers” but rounding yields a mediocre integral solution of 259 workers. 65 Modeling 911 For this and many other applications, heuristics can be used to develop good solutions Progressive Roundoff - solve the linear relaxation, round up first variable and freeze it, re-solve etc. 66 Solving the Integer Problem main() // Planner Code { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcLinOpt simplex(m); IlcFloatVarArray Pct(m,168,0,1000); IlcFloatArray coeffs(m,168); int i,j,k,h,n; 67 Solving the Integer Problem //  Pctw  bh w for(h=0;h<168;h++) { // for each hour of 168 in week for(j=0;j<168;j++) coeffs[j] = 0; for(k=0;k<5;k++) // for each of 5 days for(j=k*24;j= b[h]); } 68 Solving the Integer Linear Problem IlcFloatVar Z = IlcSum(Pct);// Objective simplex.setObjMin(Z); for(i=0;i<168;i++) { //Progressive roundoff n =ceil(simplex.getCurrentValue(Pct[i])); // Fix variable and re-optimize simplex.add(Pct[i] == n); } m.out() << “Number of Pcts needed is “ << Z << endl; m.end(); } 69 Solution This code finds a solution with 208 workers in a couple of seconds. The optimum is 207. The heuristic works well in part because if there were no lunch breaks, it would find the guaranteed optimal solution [Bartholdi,Ratliffe,Orlin] 70 2. Constraint Programming LP/MIP is Beautiful, except when Variable domain information is important to the search strategy especially critical in scheduling The problem variables range over symbolic entities and there are lots of symmetries timetabling The MIP representation can be too verbose or awkward configuration There are just too many constraints e.g. vehicle routing 72 Mathematical Basis of Constraint Programming (CP) The Constraint Satisfaction Problem: Suppose a finite set of variables is given and with each variable is associated a non-empty finite domain. A constraint on k variables X1,…,Xk is a relation R(X1,…,Xk)  D1 x …x Dk. A constraint satisfaction problem (CSP) is given by a finite set of constraints. A solution to a CSP is an assignment of values to all the variables so that the constraints are satisfied. 73 Domain Reduction In CP, each constraint of a CSP is considered as a subproblem and techniques are developed for handling frequently encountered constraints. With each constraint is associated a domain reduction algorithm which reduces the domains of the variables that occur in the constraint. Accelerates convergence toward a solution Detects infeasibility 74 Constraint Propagation The other key issue is communication among the constraints or subproblems. The basic method used is called constraint propagation which links the constraints through their shared variables. The important thing about this setup is that it is very modular and independent of the particular structure of the individual constraints. Monsieur Jordan Phenomenon Like prose, you have been doing constraint propagation all your life. Crossword puzzles Incomplete and so backtracking is needed NY Times Sunday Crossword Optical Illusions Origin: Vision analysis (Marr,Waltz et al) 76 Strengths of Constraint Programming Constraint Programming provides a rich Rich Rich representation language. CP variables naturally represent problem entities and the constraints do not have to be translated into a specific problem format such as MIP or SAT. Opportunity to choose a good heuristic for the solution strategy. 77 Which Method for Which App? Product Mix LP Production Planning MIP Distribution Planning MIP Scheduling Constraint Based Scheduling Dispatching CP Local search Configuration CP Technology Application Linear => Disjunctive Constraints Strategic => Operational Optimization 78 3. Cooperating Solvers First Stop CP/CP 80 Mother of All Examples - N Queens Do we think in terms of queens Where do we place this queen ? Do we think in terms of squares Will this square contain a queen ? These views are dual to one other The Primal View For each queen assign it a square Place this queen in this square ? The Dual View For each square decide whether it will have a queen Place a queen in this square ? 83 The Primal Model In which row do we place q[j] - the queen in column j The constraints q[i] != q[j] q[i] - q[j] != i - j q[i] - q[j] != j - i Note: no alldifferent constraint 84 Yet Another duality - rows vs columns In which column do we place qq[i] the queen in row i The constraints are the same qq[i] != qq[j] qq[i] - qq[j] != i - j qq[i] - qq[j] != j - i 85 The Relationship Can link them as inverse functions: q[qq[i]] = i qq[q[j]] = j The constraint propagation i leaves domain of q[j] iff j leaves domain of qq[i] 86 In this primal/dual model Apply first-fail to q[i] Lo and behold one-third fewer fails (Example from Jean Jordan’s thesis) 87 88 Q Q X X X X X X X X X X X X X X X 89 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X 90 Q Q X X X X X X X X X X X X X X X Q X X X X X X X X X X Q X X X X 91 So … The cooperating primal-dual formulation “captured” the generalized arc consistency of the alldifferent constraint The arc consistency of this global constraint is non-trivial to maintain Network flow algorithms flow goes from values to variables each variable has unit demand and capacity 92 Remarks An IP model will encode the first dual solution Will this square contain a queen x[i][j] = 0 or x[i][j] = 1 A disaster beyond 30 queens network structure on rows and columns lost Another example - sports scheduling 93 Constraints and Indices In IP symbols are represented by indices as opposed to values for variables. nurses, teams Paris-St-Germain plays Manchester United on day k xijk {0-1} to represent “team i plays team j on day k” You can’t put symmetry breaking and other constraints on indices. Second Stop CP/IP 95 CP Is Powerful, But …. Sometimes, inconsistencies can be overlooked X - Y  12 X + Y  10 X in [1..20] Y in [1..20] Domain reduction on each constraint and constraint propagation will not reduce the domains although the system has no linear solution but an LP solver would spot this 96 2 Dimensional Bin Packing Application for the Automobile Industry built by Greg Glockner 97 2 Dimensional Bin Packing The problem here is to put as many small rectangles in a big rectangle with 90 degree rotation allowed. The actual application involves circuit boards There are two complete models, one a CP model and the other an IP model. The CP model directs the search The LP relaxation prunes the search space by detecting infeasible nodes 98 2-D Bin Packing Arrange circuit boards onto raw material Boards may be rotated Use same number of each board Objective: minimize scrap Classic combinatorial optimization problem 99 Solving 2-D Bin Packing Use CP to generate partial solutions (nodes) Restrict placement to reduce fragmentation of blank space Use tight LP to test feasibility If any partial solution is infeasible in the LP, prune the tree immediately CP constraints reduce the tree width LP allows us to prune quickly 100 2 Dimensional Bin Packing As the search tree is traversed, the two models are in sync. Note that the variables used in the 2 models are disjoint The two models are dual to each other The IP sees the model from the point of view of the board, the large rectangle The CP sees the model from the point of view of the small rectangles Solutions are obtained in minutes 101 2DBP: Basic CP Formulation Let (xi, yi) be the location and (wi, hi) be the dimensions of the ith tile Basic constraints: Disjunctive constraints to prevent overlapping tiles xi + wi £ xj Ú yi + hi £ yj Ú xj + wj £ xi Ú yj + hj £ yi Constraints to count the number of each tile type Tile-oriented formulation 102 2DBP: Basic IP Formulation Let xijnt = 1 if tile n of type t is in position (i, j) The constraints are: t n j i x j i x t n x ijnt h j j w i i j j i i t n j i nt j i j i ijnt t t , , , } 1 , 0 { , 1 , 1 , , , : , , , , " Î " £ " = å å + ¢ < + ¢ < £ ¢ £ ¢ ¢ ¢ ¢ ¢ Grid-oriented formulation 103 2DBP: LP Issues The LP is large The LP exhibits significant primal degeneracy The LP exhibits significant dual degeneracy 104 2DBP: LP Issues The simplex algorithm cannot solve the LP There is no way for a MIP solver to solve the IP as such The barrier method can solve the LP 105 2DBP: Summary CP as master problem Orders tiles Places tiles by position, then type Selects tile type by frequency to scatter tiles throughout the bin Uses a one-ply lookahead constraint to limit the position of following tile LP relaxation prunes the CP search space Checks whether the partial solution will lead to an infeasible instance Use idiomatic formulations for CP and IP 106 2DBP: Remarks The CP fixes significant numbers of variables at each node The LP pre-processor greatly simplifies the LP Therefore, the lack of incrementality of the barrier method does not cost us 107 2DBP: Cooperative Algorithm Demo 108 Last Stop Constraint Programming and Local Search cooperation Another example of duality in action 109 CP/LS Parallel machines with set-up times Ready times Dues dates Splittable jobs Rogue machines Objectives meet due dates minimize setup costs 110 Two Phase Cooperation Phase I - the Primal (Work on first objective) Configure and schedule the jobs Use constraint based scheduling 111 Two Phase Cooperation Machines morph into trucks 112 Two Phase Cooperation Phase 2 - the Dual (Work on second objective) Schedule the trucks Use Lin, Lin-Kernighan, tabu etc 113 Parallel Machines: Cooperative Algorithm Demo 114 IC Park Example Hoist Scheduling (Rodosek and Wallace) The original model is an IP The CP model is “the same” CP guides search, LP relaxation and CP share pruning duties No apparent duality 115 Remarks One can get great benefit with CP/IP algorithms CP/CP algorithms and CP/LS algorithms IP/LS is just around the corner IP/IP cooperation is hard because one can’t formulate truly dual views either simply not there or too verbose counterexamples welcome 116 4. DISJUNCTIVE PROGRAMMING 117 Disjunctive Linear Programming An extension of Mixed Integer Programming A union of polyhedral sets (feasible regions) is called a disjunctive set. 118 Disjunctive Set 119 Disjunctive Linear Programming The problem of determining whether the intersection of a family of disjunctive sets is non-empty is called the disjunctive linear programming problem or simply disjunctive programming problem. The solution set of the disjunctive programming problem is Ç È Fij i= 100 or X == 0 Rather than X <= BigM*Y , X >=100*Y, Y a 0-1 variable 121 Solution Set Inside Initial Region 122 Disjunctive Linear Programming Examples - continued Bollapragada, Ghattas and Hooker Truss structure design problem Branches directly on alternatives dictated by Hooke’s Law Wyatt Disjunctive programming and mean absolute deviation models (MAD) for portfolio optimization Extends Bender’s decomposition to disjunctive linear programs 123 Disjunctive Linear Programming continued Balas, Cornuejols and Ceria Generating cuts for disjunctive programming problems. McAloon and Tretkoff Basic mathematical results: Optimization and Computational Logic, Wiley 124 Disjunctive Linear Programming continued Dealing with the disjunctive part requires search. This requires an engine which is not available in MIP packages Also the linear relaxation is not as tight and the evaluation function is not as faithful The solution is to use a CSP solver and an LP based solver in tandem - cooperating solvers Beringer and DeBacker for MIP 125 To Keep It Simple GERALD + DONALD = ROBERT An AI classic Newell and Simon Assignment problem + 1 constraint Surprisingly hard for MIP solvers CPLEX MIP takes 1 minute and 29048 nodes (on Sun Enterprise) to find a feasible integer solution 126 The Disjunctive Program One constraint for the equation 100000 G + … + D = 100000 R + … + T For each variable X among G,…,T X = 0 or X = 1 or … or X = 9 For each pair X, Y X  Y-1 or Y  X-1 127 Solution Set: SOME of the Integer Points in the Region 128 The Twin Variables for Cooperating Solvers Integer variables for the letters 0  g, e, r, a, l, d, o, n, b, t  9 With continuous doppelgangers 0  G, E, R, A, L, D, O, N, B, T  9 129 The Variables One multi-variable constraint on the continuous doppelgangers posted to an LP solver and to the CSP solver 100000 G + 10000 E + 1000 R + … + D + 100000 D + 10000 O + 1000 N + … + D = 100000 R + 10000 O + 1000 B + … + T 130 The Variables One CSP constraint on the integer variables posted to a discrete constraint propagation engine AllDifferent(g, e, r, a, l, d, r, n, b, t ) 131 The Search Bounding information from the discrete variables is passed to the continuous doppelgangers and conversely The branching strategy is guided by the linear relaxation on the continuous variables if there is a non-integral variable X, branch on it X  floor(X*) or X  ceil(X*) 132 The Search If the AllDifferent constraint, the initial bounding constraints and the bounding constraints from branching detect a contradiction on the discrete variables, both sides backtrack If the linear relaxation is made infeasible by the bounding constraints that come from the discrete computation or from branching, both sides backtrack 133 The Search New wrinkle The solution to the linear relaxation might have all variables integral - but the AllDifferent constraint can be violated by this set of values In this case, branch to keep them apart either X  Y - 1 or Y  X - 1 134 The Variables void main() { IlcInitFloat(); IlcManager m(IlcNoEdit); IlcIntVar D(m, 1, 9), O(m, 0, 9), N(m, 0, 9), A(m, 0, 9), L(m, 0, 9), G(m, 1, 9), E(m, 0, 9), R(m, 1, 9), B(m, 0, 9), T(m, 0, 9); IlcIntVarArray vars (m, 10, D, O, N, A, L, G, E, R, B, T); // Continued on next slide 135 The Constraints m.add(IlcAllDiff(vars,IlcWhenValue)); IlcLinOpt simplex(m); simplex.add( 100000*R + 10000*O + 1000*B + 100*E + 10*R + T == 100000*G + 10000*E + 1000*R + 100*A + 10*L + D + 100000*D + 10000*O + 1000*N + 100*A + 10*L + D , IlcTrue // Post to Solver as well ); 136 The Search for solutions m.add(Generate(m,simplex,vars)); // Search strategy if (m.nextSolution()) { // Find a solution m.out() << " solution found " << endl;; } m.printInformation(); m.end(); } 137 Branch if a variable is non-integer ILCGOAL2(Generate, IlcSimplex, simplex, IlcIntVarArray, vars) { IlcInt varIndex = MostNotInteger(vars, simplex); if (varIndex >= 0) // There is a non-integer variable return IlcAnd(IlcTryUpwardFirst(vars[varIndex], simplex), this); 138 Is integer relaxation a solution ? IlcManager m = getManager(); if(m.solve(TestIntegerRelaxation(m,simplex))) return 0; 139 Find two variables with same value IlcInt j; for(i=0;i100000 183 Heavy-Tailed Distributions … infinite variance … infinite mean Introduced by Pareto in the 1920’s --- “probabilistic curiosity.” Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: stock-market, earth-quakes, weather,... 184 Decay of Distributions Standard --- Exponential Decay e.g. Normal: Heavy-Tailed --- Power Law Decay e.g. Pareto-Levy: 185 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay 186 Normal, Cauchy, and Levy Normal - Exponential Decay Cauchy -Power law Decay Levy -Power law Decay 187 Tail Probabilities (Standard Normal, Cauchy, Levy) 188 How to Check for “Heavy Tails”? Log-Log plot of tail of distribution should be approximately linear. Slope gives value of infinite mean and infinite variance infinite variance 189 Example of Heavy Tailed Model (Random Walk) Random Walk: Start at position 0 Toss a fair coin: with each head take a step up (+1) with each tail take a step down (-1) X --- number of steps the random walk takes to return to position 0. 190 The record of 10,000 tosses of an ideal coin (Feller) Zero crossing Long periods without zero crossing 191 Random Walk Heavy-tails vs. Non-Heavy-Tails Normal (2,1000000) Normal (2,1) O,1%>200000 Median=2 1-F(x) Unsolved fraction X - number of steps the walk takes to return to zero (log scale) 192 Number backtracks (log) 1-F(x) Unsolved fraction => Infinite mean Heavy-tails in QCP Domain 193 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 194 195 196 Heavy Tailed Cost Distribution 197 The Log-Log plot shows a linear relation over many orders of magnitude. This is clear evidence of heavy-tailed behavior. 198 By studying larger problems we discovered that not only does the heavy tail phenomenon occur at the right-hand side of the distribution, but we also observed a high frequency of data points on the left-hand side of the distribution. Right-hand side: non-negligible fraction of very long runs Left-hand side: non-negligible fraction of very short runs 199 70%> 250000 15! Sports Scheduling Number backtracks (log) Cumulative Distribution Function 200 Standard Distribution (finite mean & variance) Power Law Decay Exponential Decay Also, heavy tails on left. (High probability of very short runs.) 201 Consequence for algorithm design: Use rapid restarts or parallel / inter-leaved runs Super linear speedups!!! 202 Super-linear Speedups Interleaved (1 machine): 10 x 1 = 10 seconds 5 x speedup 203 Rapid Restarts work particularly well on hard computational problems because of the Heavy Tailed Phenomena in the run time distribution. RAPID RANDOMIZED RESTARTS strategy avoids the tail on the right and exploits the short runs on the left. Restarts provably eliminate heavy tails (Gomes, Selman & Crato ) 204 Sketch of proof of elimination of heavy tails Let’s truncate the search procedure after m backtracks. Probability of solving problem with truncated version: Run the truncated procedure and restart it repeatedly. 205 Y - does not have Heavy Tails 206 Restarts 70% unsolved 250 ~ 62.5 restarts 1-F(x) Unsolved fraction Number backtracks (log) 207 Example of Rapid Restart Speedup (planning) 20 2000 ~100 restarts Cutoff (log) Number backtracks (log) ~10 restarts 100000 208 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 209 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 210 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 211 Deterministic Logistics Planning 108 mins. 95 sec. Scheduling 14 411 sec 250 sec (*) not found after 2 days Scheduling 16 ---(*) 1.4 hours Scheduling 18 ---(*) ~18 hrs Circuit Synthesis 1 ---(*) 165sec. Circuit Synthesis 2 ---(*) 17min. Summary Results R 3 212 Rapid Restart Speedup 213 Our results provide the first indication of heavy-tailed distri-butions in a computational model. Overall insight: Randomized tie-breaking with rapid restarts gives powerful search strategy. 214 Heavy-Tailed Distributions in Other Domains Quasigroup Completion Problem Graph Coloring Logistic Planning Circuit Synthesis 215 Algorithm Portfolio Design 216 Motivation The runtime and performance of randomized algorithms can vary dramatically on the same instance and on different instances. Goal: Improve the performance of different algorithms by combining them into a portfolio to exploit their relative strengths. 217 Branch & Bound: Best Bound vs. Depth First Search 218 Branch & Bound (Randomized) Standard OR approach for solving Mixed Integer Programs (MIPs) Solve linear relaxation of MIP Branch on the integer variables for which the solution of the LP relaxation is non-integer: apply a good heuristic (e.g., max infeasibility) for variable selection ( + randomization ) and create two new nodes (floor and ceiling of the fractional value) Once we have found an integer solution, its objective value can be used to prune other nodes, whose relaxations have worse values 219 Branch & Bound Depth First vs. Best bound Critical in performance of Branch & Bound: the way in which the next node to be expanded is selected. Best-bound - select the node with the best LP bound (standard OR approach) ---> this case is equivalent to A*, the LP relaxation provides an admissible search heuristic Depth-first - often quickly reaches an integer solution (may take longer to produce an overall optimal value) 220 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 221 Depth-first vs. Best-bound (logistics planning) Number of nodes Cumulative Frequencies Depth-First ~50% Best-Bound ~30% 222 Depth-First and Best and Bound do not dominate each other overall. 223 Heavy-tailed behavior of Depth-first 224 Portfolio for heavy-tailed search procedures (2 processors) 0 DF / 2 BB 2 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 225 Portfolio for heavy-tailed search procedures (6 processors) 0 DF / 6 BB 6 DF / 0BB Standard deviation of run time of portfolios Expected run time of portfolios 5 DF / 1BB 3 DF / 3 BB 4 DF / 2 BB Efficient set 226 Portfolio for heavy-tailed search procedures (20 processors) 0 DF / 20 BB 20 DF / 0 BB Standard deviation of run time of portfolios Expected run time of portfolios 227 Portfolio for heavy-tailed search procedures (2-20 processors) 228 A portfolio approach can lead to substantial improvements in the expected cost and risk of stochastic algorithms, especially in the presence of heavy-tailed phenomena. 229 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- cuts very long runs --- exploits ultra-short runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 230 Summary of Randomization Considered randomized backtrack search. Showed Heavy-Tailed Distributions. Suggests: Rapid Restart Strategy. --- exploits ultra-short runs --- cuts very long runs Experimentally validated on previously unsolved planning and scheduling problems. Portfolio of Algorithms for cases where no single heuristic dominates 231 IV. CONCLUSIONS 232 Important Themes in OR Linear Programming (Mixed) Integer Programming Exploit Structure e.g., Network Flow Problems Duality very elegant theory in LP sensitivity analysis 233 Opportunities for Integration of AI/OR OR methods: Have focused on tractable representations (LP) Have demonstrated the ability to identify optimal and locally optimal solutions LIMITATION: Restricted to rigid models with limited expressive power AI methods: Richer and more flexible representations,supporting constraint-based reasoning mechanisms as well as mixed initiative frameworks, allowing the human expertise to be in the loop. LIMITATION: Rich representations in general lead to intractable problems CHALLENGE: good representations / fast & good solutions , 234 Opportunities for Integration of AI/OR AI methods are becoming competitive AI methods used to be considered not suitable for realworld scheduling problems. Recent developments have shown they can be competitive. Examples: SAP, Peoplesoft, I2, … -> provide solutions for scheduling combining constraint programming and mathematical programming approaches. ILOG (CP language) has several fielded applications in different scheduling areas; ILOG has integrated a CSP solver with CPLEX. OR people have acknowledge the benefits of combining OR and AI methods 235 Opportunities for Integration of AI/OR Exploiting Duality in CSP frameworks Exploiting Randomization Hybrid Solvers 236 Opportunities for Integration of AI/OR Hybrid Solvers - emerging area of research (CSP+OR); it started with CLP(R), Prolog III and CHIP; ILOG integrates a CSP solver with CPLEX local constraint propagation - local consistency algs global constraint propagation - LP relaxations Only a hybrid approach could prove optimality, e.g.: Hoist scheduling (Rodosek & Wallace 1998) Multicommodity integer network flow problem (Dutch Railways) (McAloon, Tretkoff, Wetzel 1998) 237 Updated version of tutorial slides www.cs.cornell.edu/gomes/ Talks Demos 238 Appendix 239 Portfolio of Algorithms A portfolio of algorithm is a collection of algorithms and / or copies of the same algorithm running interleaved or on different processors. A portfolio has an expected computational cost and a standard deviation, a measure of the dispersion of the computational cost. The standard deviation of the portfolio is a measure of the risk inherent to the portfolio. 240 Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost “risk” (variance) Efficient Set or Efficient Frontier: set of portfolios that are best in terms of expected value and risk. 241 Appendix Portfolio of Algorithms Goal: to improve on the performance of the component algorithms in terms of: expected computational cost; risk; Efficient Set or Efficient Frontier - set of portfolios that are the best in terms of expected value and risk. Within the efficient set, in order to minimize the risk, one has to deteriorate the expected value or, in order to improve the expected value, one has to increase the risk. 242 Appendix Portfolio of Two Algorithms Let us consider the random variables: A1 - the number of backtracks that algorithm 1 takes to find a solution or prove that a solution doesn’t exist; A2 - the number of backtracks that algorithm 2 takes to find a solution or prove that a solution doesn’t exist; 243 Appendix Portfolio of Two Algorithms Let us consider that we have N processors and we design a portfolio using n1 processors with algorithm 1 and n2 processors with algorithm2 (N = n1 + n2). Let us consider the random variable: X - the number of backtracks that the portfolio takes to find a solution or prove that a solution doesn’t exist; 244 Appendix Portfolio of Two Algorithms Given N processors, and 245 Appendix Portfolio of Algorithms Given N processors, such that and and the term in the summation is 0 when 246 Preliminary Research on Structure of Search Spaces 247 Fringe of Search Tree 248 Fractal Dimension 249 Fractal Dimension When plotting the length of a curve as a function of the measuring tool on a log-log plot, one obtains a linear relationship: L - the measured length; s - length of the yardstick; c and d are constants; Mandelbrot introduced the fractal dimension D = d +1; A straight line has D = 1.0; The coast of Britain has fractal dimension 1.22; The higher D the more fractal the curve is. 250 Heavy-Tailed Behavior vs Non-heavy-tailed behavior