An introduction to optimization : with applications to machine learning

Bibliographic Information

An introduction to optimization : with applications to machine learning

Edwin K.P. Chong, Wu-Sheng Lu, Stanislaw H. Żak

Wiley, c2024

5th ed

  • : hardback

Available at  / 1 libraries

Search this Book/Journal

Note

Includes bibliographical references (p. 627-634) and index

Description and Table of Contents

Description

An Introduction to Optimization Accessible introductory textbook on optimization theory and methods, with an emphasis on engineering design, featuring MATLAB® exercises and worked examples Fully updated to reflect modern developments in the field, the Fifth Edition of An Introduction to Optimization fills the need for an accessible, yet rigorous, introduction to optimization theory and methods, featuring innovative coverage and a straightforward approach. The book begins with a review of basic definitions and notations while also providing the related fundamental background of linear algebra, geometry, and calculus. With this foundation, the authors explore the essential topics of unconstrained optimization problems, linear programming problems, and nonlinear constrained optimization. In addition, the book includes an introduction to artificial neural networks, convex optimization, multi-objective optimization, and applications of optimization in machine learning. Numerous diagrams and figures found throughout the book complement the written presentation of key concepts, and each chapter is followed by MATLAB® exercises and practice problems that reinforce the discussed theory and algorithms. The Fifth Edition features a new chapter on Lagrangian (nonlinear) duality, expanded coverage on matrix games, projected gradient algorithms, machine learning, and numerous new exercises at the end of each chapter. An Introduction to Optimization includes information on: The mathematical definitions, notations, and relations from linear algebra, geometry, and calculus used in optimization Optimization algorithms, covering one-dimensional search, randomized search, and gradient, Newton, conjugate direction, and quasi-Newton methods Linear programming methods, covering the simplex algorithm, interior point methods, and duality Nonlinear constrained optimization, covering theory and algorithms, convex optimization, and Lagrangian duality Applications of optimization in machine learning, including neural network training, classification, stochastic gradient descent, linear regression, logistic regression, support vector machines, and clustering. An Introduction to Optimization is an ideal textbook for a one- or two-semester senior undergraduate or beginning graduate course in optimization theory and methods. The text is also of value for researchers and professionals in mathematics, operations research, electrical engineering, economics, statistics, and business.

Table of Contents

Preface xv About the Companion Website xviii Part I Mathematical Review 1 1 Methods of Proof and Some Notation 3 1.1 Methods of Proof 3 1.2 Notation 5 Exercises 5 2 Vector Spaces and Matrices 7 2.1 Vector and Matrix 7 2.2 Rank of a Matrix 11 2.3 Linear Equations 16 2.4 Inner Products and Norms 18 Exercises 20 3 Transformations 23 3.1 Linear Transformations 23 3.2 Eigenvalues and Eigenvectors 24 3.3 Orthogonal Projections 26 3.4 Quadratic Forms 27 3.5 Matrix Norms 32 Exercises 35 4 Concepts from Geometry 39 4.1 Line Segments 39 4.2 Hyperplanes and Linear Varieties 39 4.3 Convex Sets 41 4.4 Neighborhoods 43 4.5 Polytopes and Polyhedra 44 Exercises 45 5 Elements of Calculus 47 5.1 Sequences and Limits 47 5.2 Differentiability 52 5.3 The Derivative Matrix 54 5.4 Differentiation Rules 57 5.5 Level Sets and Gradients 58 5.6 Taylor Series 61 Exercises 65 Part II Unconstrained Optimization 67 6 Basics of Set-Constrained and Unconstrained Optimization 69 6.1 Introduction 69 6.2 Conditions for Local Minimizers 70 Exercises 78 7 One-Dimensional Search Methods 87 7.1 Introduction 87 7.2 Golden Section Search 87 7.3 Fibonacci Method 91 7.4 Bisection Method 97 7.5 Newton’s Method 98 7.6 Secant Method 101 7.7 Bracketing 103 7.8 Line Search in Multidimensional Optimization 103 Exercises 105 8 Gradient Methods 109 8.1 Introduction 109 8.2 Steepest Descent Method 110 8.3 Analysis of Gradient Methods 117 Exercises 126 9 Newton’s Method 133 9.1 Introduction 133 9.2 Analysis of Newton’s Method 135 9.3 Levenberg–Marquardt Modification 138 9.4 Newton’s Method for Nonlinear Least Squares 139 Exercises 142 10 Conjugate Direction Methods 145 10.1 Introduction 145 10.2 Conjugate Direction Algorithm 146 10.2.1 Basic Conjugate Direction Algorithm 146 10.3 Conjugate Gradient Algorithm 151 10.4 Conjugate Gradient Algorithm for Nonquadratic Problems 154 Exercises 156 11 Quasi-Newton Methods 159 11.1 Introduction 159 11.2 Approximating the Inverse Hessian 160 11.3 Rank One Correction Formula 162 11.4 DFP Algorithm 166 11.5 BFGS Algorithm 170 Exercises 173 12 Solving Linear Equations 179 12.1 Least-Squares Analysis 179 12.2 Recursive Least-Squares Algorithm 187 12.3 Solution to a Linear Equation with Minimum Norm 190 12.4 Kaczmarz’s Algorithm 191 12.5 Solving Linear Equations in General 194 Exercises 201 13 Unconstrained Optimization and Neural Networks 209 13.1 Introduction 209 13.2 Single-Neuron Training 211 13.3 Backpropagation Algorithm 213 Exercises 222 14 Global Search Algorithms 225 14.1 Introduction 225 14.2 Nelder–Mead Simplex Algorithm 225 14.3 Simulated Annealing 229 14.3.1 Randomized Search 229 14.3.2 Simulated Annealing Algorithm 229 14.4 Particle Swarm Optimization 231 14.4.1 Basic PSO Algorithm 232 14.4.2 Variations 233 14.5 Genetic Algorithms 233 14.5.1 Basic Description 233 14.5.1.1 Chromosomes and Representation Schemes 234 14.5.1.2 Selection and Evolution 234 14.5.2 Analysis of Genetic Algorithms 238 14.5.3 Real-Number Genetic Algorithms 243 Exercises 244 Part III Linear Programming 247 15 Introduction to Linear Programming 249 15.1 Brief History of Linear Programming 249 15.2 Simple Examples of Linear Programs 250 15.3 Two-Dimensional Linear Programs 256 15.4 Convex Polyhedra and Linear Programming 258 15.5 Standard Form Linear Programs 260 15.6 Basic Solutions 264 15.7 Properties of Basic Solutions 267 15.8 Geometric View of Linear Programs 269 Exercises 273 16 Simplex Method 277 16.1 Solving Linear Equations Using Row Operations 277 16.2 The Canonical Augmented Matrix 283 16.3 Updating the Augmented Matrix 284 16.4 The Simplex Algorithm 285 16.5 Matrix Form of the Simplex Method 291 16.6 Two-Phase Simplex Method 294 16.7 Revised Simplex Method 297 Exercises 301 17 Duality 309 17.1 Dual Linear Programs 309 17.2 Properties of Dual Problems 316 17.3 Matrix Games 321 Exercises 324 18 Nonsimplex Methods 331 18.1 Introduction 331 18.2 Khachiyan’s Method 332 18.3 Affine Scaling Method 334 18.3.1 Basic Algorithm 334 18.3.2 Two-Phase Method 337 18.4 Karmarkar’s Method 339 18.4.1 Basic Ideas 339 18.4.2 Karmarkar’s Canonical Form 339 18.4.3 Karmarkar’s Restricted Problem 341 18.4.4 From General Form to Karmarkar’s Canonical Form 342 18.4.5 The Algorithm 345 Exercises 349 19 Integer Linear Programming 351 19.1 Introduction 351 19.2 Unimodular Matrices 351 19.3 The Gomory Cutting-Plane Method 358 Exercises 366 Part IV Nonlinear Constrained Optimization 369 20 Problems with Equality Constraints 371 20.1 Introduction 371 20.2 Problem Formulation 373 20.3 Tangent and Normal Spaces 374 20.4 Lagrange Condition 379 20.5 Second-Order Conditions 387 20.6 Minimizing Quadratics Subject to Linear Constraints 390 Exercises 394 21 Problems with Inequality Constraints 399 21.1 Karush–Kuhn–Tucker Condition 399 21.2 Second-Order Conditions 406 Exercises 410 22 Convex Optimization Problems 417 22.1 Introduction 417 22.2 Convex Functions 419 22.3 Convex Optimization Problems 426 22.4 Semidefinite Programming 431 22.4.1 Linear Matrix Inequalities and Their Properties 431 22.4.2 LMI Solvers 435 22.4.2.1 Finding a Feasible Solution Under LMI Constraints 436 22.4.2.2 Minimizing a Linear Objective Under LMI Constraints 438 22.4.2.3 Minimizing a Generalized Eigenvalue Under LMI Constraints 440 Exercises 442 23 Lagrangian Duality 449 23.1 Overview 449 23.2 Notation 449 23.3 Primal–Dual Pair 450 23.4 General Duality Properties 451 23.4.1 Convexity of Dual Problem 451 23.4.2 Primal Objective in Terms of Lagrangian 451 23.4.3 Minimax Inequality Chain 452 23.4.4 Optimality of Saddle Point 452 23.4.5 Weak Duality 453 23.4.6 Duality Gap 453 23.5 Strong Duality 454 23.5.1 Strong Duality ⇔ Minimax Equals Maximin 454 23.5.2 Strong Duality ⇒ Primal Unconstrained Minimization 455 23.5.3 Strong Duality ⇒ Optimality 455 23.5.4 Strong Duality ⇒ KKT (Including Complementary Slackness) 455 23.5.5 Strong Duality ⇒ Saddle Point 456 23.6 Convex Case 456 23.6.1 Convex Case: KKT ⇒ Strong Duality 456 23.6.2 Convex Case: Regular Optimal Primal ⇒ Strong Duality 457 23.6.3 Convex Case: Slater’s Condition ⇒ Strong Duality 457 23.7 Summary of Key Results 457 Exercises 458 24 Algorithms for Constrained Optimization 459 24.1 Introduction 459 24.2 Projections 459 24.3 Projected Gradient Methods with Linear Constraints 462 24.4 Convergence of Projected Gradient Algorithms 465 24.4.1 Fixed Points and First-Order Necessary Conditions 466 24.4.2 Convergence with Fixed Step Size 468 24.4.3 Some Properties of Projections 469 24.4.4 Armijo Condition 470 24.4.5 Accumulation Points 471 24.4.6 Projections in the Convex Case 472 24.4.7 Armijo Condition in the Convex Case 474 24.4.8 Convergence in the Convex Case 480 24.4.9 Convergence Rate with Line-Search Step Size 481 24.5 Lagrangian Algorithms 483 24.5.1 Lagrangian Algorithm for Equality Constraints 484 24.5.2 Lagrangian Algorithm for Inequality Constraints 486 24.6 Penalty Methods 489 Exercises 495 25 Multiobjective Optimization 499 25.1 Introduction 499 25.2 Pareto Solutions 499 25.3 Computing the Pareto Front 501 25.4 From Multiobjective to Single-Objective Optimization 505 25.5 Uncertain Linear Programming Problems 508 25.5.1 Uncertain Constraints 508 25.5.2 Uncertain Objective Function Coefficients 511 25.5.3 Uncertain Constraint Coefficients 513 25.5.4 General Uncertainties 513 Exercises 513 Part V Optimization in Machine Learning 517 26 Machine Learning Problems and Feature Engineering 519 26.1 Machine Learning Problems 519 26.1.1 Data with Labels and Supervised Learning 519 26.1.2 Data Without Labels and Unsupervised Learning 521 26.2 Data Normalization 522 26.3 Histogram of Oriented Gradients 524 26.4 Principal Component Analysis and Linear Autoencoder 526 26.4.1 Singular Value Decomposition 526 26.4.2 Principal Axes and Principal Components of a Data Set 527 26.4.3 Linear Autoencoder 529 Exercises 530 27 Stochastic Gradient Descent Algorithms 537 27.1 Stochastic Gradient Descent Algorithm 537 27.2 Stochastic Variance Reduced Gradient Algorithm 540 27.3 Distributed Stochastic Variance Reduced Gradient 542 27.3.1 Distributed Learning Environment 542 27.3.2 SVRG in Distributed Optimization 543 27.3.3 Communication Versus Computation 545 27.3.4 Data Security 545 Exercises 546 28 Linear Regression and Its Variants 553 28.1 Least-Squares Linear Regression 553 28.1.1 A Linear Model for Prediction 553 28.1.2 Training the Model 554 28.1.3 Computing Optimal ̂w 554 28.1.4 Optimal Predictor and Performance Evaluation 555 28.1.5 Least-Squares Linear Regression for Data Sets with Vector Labels 556 28.2 Model Selection by Cross-Validation 559 28.3 Model Selection by Regularization 562 Exercises 564 29 Logistic Regression for Classification 569 29.1 Logistic Regression for Binary Classification 569 29.1.1 Least-Squares Linear Regression for Binary Classification 569 29.1.2 Logistic Regression for Binary Classification 570 29.1.3 Interpreting Logistic Regression by Log Error 572 29.1.4 Confusion Matrix for Binary Classification 573 29.2 Nonlinear Decision Boundary via Linear Regression 575 29.2.1 Least-Squares Linear Regression with Nonlinear Transformation 576 29.2.2 Logistic Regression with Nonlinear Transformation 578 29.3 Multicategory Classification 580 29.3.1 One-Versus-All Multicategory Classification 580 29.3.2 Softmax Regression for Multicategory Classification 581 Exercises 584 30 Support Vector Machines 589 30.1 Hinge-Loss Functions 589 30.1.1 Geometric Interpretation of the Linear Model 589 30.1.2 Hinge Loss for Binary Data Sets 590 30.1.3 Hinge Loss for Multicategory Data Sets 592 30.2 Classification by Minimizing Hinge Loss 593 30.2.1 Binary Classification by Minimizing Average Hinge Loss 593 30.2.2 Multicategory Classification by Minimizing E hww or E hcs 594 30.3 Support Vector Machines for Binary Classification 596 30.3.1 Hard-Margin Support Vector Machines 596 30.3.2 Support Vectors 598 30.3.3 Soft-Margin Support Vector Machines 599 30.3.4 Connection to Hinge-Loss Minimization 602 30.4 Support Vector Machines for Multicategory Classification 602 30.5 Kernel Trick 603 30.5.1 Kernels 603 30.5.2 Kernel Trick 604 30.5.3 Learning with Kernels 605 30.5.3.1 Regularized Logistic Regression with Nonlinear Transformation for Binary Classification 605 30.5.3.2 Regularized Hinge-Loss Minimization for Binary Classification 606 Exercises 607 31 K-Means Clustering 611 31.1 K-Means Clustering 611 31.2 K-Means++ forCenterInitialization 615 31.3 Variants of K-Means Clustering 617 31.3.1 K-Means Clustering Based on 1-Norm Regularization 617 31.3.2 PCA-Guided K-Means Clustering 619 31.4 Image Compression by Vector Quantization and K-Means Clustering 622 Exercises 623 References 627 Index 635

by "Nielsen BookData"

Details

  • NCID
    BD06860095
  • ISBN
    • 9781119877639
  • Country Code
    us
  • Title Language Code
    eng
  • Text Language Code
    eng
  • Place of Publication
    Hoboken, N.J.
  • Pages/Volumes
    xviii, 651 p.
  • Size
    26 cm
  • Classification
  • Subject Headings
Page Top