Data mining for business intelligence : concepts, techniques, and applications in Microsoft Office Excel with XLMiner

Bibliographic information

Data mining for business intelligence : concepts, techniques, and applications in Microsoft Office Excel with XLMiner

Galit Shmueli, Nitin R. Patel, Peter C. Bruce

Wiley, c2010

2nd ed

  • : cloth

University library holdings: 4

Notes

"A John Wiley & Sons, Inc., publication"

"Includes complimentary access to XLMiner" -- Cover

Includes bibliographical references (p. 397) and index

Description and table of contents

Description

Praise for the First Edition

"...full of vivid and thought-provoking anecdotes...needs to be read by anyone with a serious interest in research and marketing." -- Research magazine

"Shmueli et al. have done a wonderful job in presenting the field of data mining...a welcome addition to the literature." -- computingreviews.com

Incorporating a new focus on data visualization and time series forecasting, Data Mining for Business Intelligence, Second Edition continues to supply insightful, detailed guidance on fundamental data mining techniques. This new edition guides readers through the use of the Microsoft Office Excel add-in XLMiner for developing predictive models and techniques for describing and finding patterns in data. From clustering customers into market segments and finding the characteristics of frequent flyers to learning what items are purchased with other items, the authors use interesting, real-world examples to build a theoretical and practical understanding of key data mining methods, including classification, prediction, and affinity analysis as well as data reduction, exploration, and visualization.

The Second Edition now features:

  • Three new chapters on time series forecasting, introducing popular business forecasting methods including moving average and exponential smoothing methods; regression-based models; and topics such as explanatory vs. predictive modeling, two-level models, and ensembles

  • A revised chapter on data visualization that now features interactive visualization principles and added assignments that demonstrate interactive visualization in practice

  • Separate chapters that each treat k-nearest neighbors and Naive Bayes methods

  • Summaries at the start of each chapter that supply an outline of key topics

The book includes access to XLMiner, allowing readers to work hands-on with the provided data. Throughout the book, applications of the discussed topics focus on the business problem as motivation and avoid unnecessary statistical theory. Each chapter concludes with exercises that allow readers to assess their comprehension of the presented material. The final chapter includes a set of cases that require use of the different data mining techniques, and a related Web site features data sets, exercise solutions, PowerPoint slides, and case solutions.

Data Mining for Business Intelligence, Second Edition is an excellent book for courses on data mining, forecasting, and decision support systems at the upper-undergraduate and graduate levels. It is also a one-of-a-kind resource for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.

Table of contents

Foreword
Preface to the second edition
Preface to the first edition
Acknowledgments

Part I PRELIMINARIES

Chapter 1 Introduction
  1.1 What Is Data Mining?
  1.2 Where Is Data Mining Used?
  1.3 Origins of Data Mining
  1.4 Rapid Growth of Data Mining
  1.5 Why Are There So Many Different Methods?
  1.6 Terminology and Notation
  1.7 Road Maps to This Book

Chapter 2 Overview of the Data Mining Process
  2.1 Introduction
  2.2 Core Ideas in Data Mining
  2.3 Supervised and Unsupervised Learning
  2.4 Steps in Data Mining
  2.5 Preliminary Steps
  2.6 Building a Model: Example with Linear Regression
  2.7 Using Excel for Data Mining

Part II DATA EXPLORATION AND DIMENSION REDUCTION

Chapter 3 Data Visualization
  3.1 Uses of Data Visualization
  3.2 Data Examples
  3.3 Basic Charts: Bar Charts, Line Graphs, and Scatterplots
  3.4 Multidimensional Visualization
  3.5 Specialized Visualizations
  3.6 Summary of Major Visualizations and Operations, According to Data Mining Goal

Chapter 4 Dimension Reduction
  4.1 Introduction
  4.2 Practical Considerations
  4.3 Data Summaries
  4.4 Correlation Analysis
  4.5 Reducing the Number of Categories in Categorical Variables
  4.6 Converting a Categorical Variable to a Numerical Variable
  4.7 Principal Components Analysis
  4.8 Dimension Reduction Using Regression Models
  4.9 Dimension Reduction Using Classification and Regression Trees

Part III PERFORMANCE EVALUATION

Chapter 5 Evaluating Classification and Predictive Performance
  5.1 Introduction
  5.2 Judging Classification Performance
  5.3 Evaluating Predictive Performance

Part IV PREDICTION AND CLASSIFICATION METHODS

Chapter 6 Multiple Linear Regression
  6.1 Introduction
  6.2 Explanatory versus Predictive Modeling
  6.3 Estimating the Regression Equation and Prediction
  6.4 Variable Selection in Linear Regression

Chapter 7 k-Nearest Neighbors (k-NN)
  7.1 k-NN Classifier (Categorical Outcome)
  7.2 k-NN for a Numerical Response
  7.3 Advantages and Shortcomings of k-NN Algorithms

Chapter 8 Naive Bayes
  8.1 Introduction
  8.2 Applying the Full (Exact) Bayesian Classifier
  8.3 Advantages and Shortcomings of the Naive Bayes Classifier

Chapter 9 Classification and Regression Trees
  9.1 Introduction
  9.2 Classification Trees
  9.3 Measures of Impurity
  9.4 Evaluating the Performance of a Classification Tree
  9.5 Avoiding Overfitting
  9.6 Classification Rules from Trees
  9.7 Classification Trees for More Than Two Classes
  9.8 Regression Trees
  9.9 Advantages, Weaknesses, and Extensions

Chapter 10 Logistic Regression
  10.1 Introduction
  10.2 Logistic Regression Model
  10.3 Evaluating Classification Performance
  10.4 Example of Complete Analysis: Predicting Delayed Flights
  10.5 Appendix: Logistic Regression for Profiling

Chapter 11 Neural Nets
  11.1 Introduction
  11.2 Concept and Structure of a Neural Network
  11.3 Fitting a Network to Data
  11.4 Required User Input
  11.5 Exploring the Relationship Between Predictors and Response
  11.6 Advantages and Weaknesses of Neural Networks

Chapter 12 Discriminant Analysis
  12.1 Introduction
  12.2 Distance of an Observation from a Class
  12.3 Fisher's Linear Classification Functions
  12.4 Classification Performance of Discriminant Analysis
  12.5 Prior Probabilities
  12.6 Unequal Misclassification Costs
  12.7 Classifying More Than Two Classes
  12.8 Advantages and Weaknesses

Part V MINING RELATIONSHIPS AMONG RECORDS

Chapter 13 Association Rules
  13.1 Introduction
  13.2 Discovering Association Rules in Transaction Databases
  13.3 Generating Candidate Rules
  13.4 Selecting Strong Rules
  13.5 Summary

Chapter 14 Cluster Analysis
  14.1 Introduction
  14.2 Measuring Distance Between Two Records
  14.3 Measuring Distance Between Two Clusters
  14.4 Hierarchical (Agglomerative) Clustering
  14.5 Nonhierarchical Clustering: The k-Means Algorithm

Part VI FORECASTING TIME SERIES

Chapter 15 Handling Time Series
  15.1 Introduction
  15.2 Explanatory versus Predictive Modeling
  15.3 Popular Forecasting Methods in Business
  15.4 Time Series Components
  15.5 Data Partitioning

Chapter 16 Regression-Based Forecasting
  16.1 Model with Trend
  16.2 Model with Seasonality
  16.3 Model with Trend and Seasonality
  16.4 Autocorrelation and ARIMA Models

Chapter 17 Smoothing Methods
  17.1 Introduction
  17.2 Moving Average
  17.3 Simple Exponential Smoothing
  17.4 Advanced Exponential Smoothing

Part VII CASES

Chapter 18 Cases
  18.1 Charles Book Club
  18.2 German Credit
  18.3 Tayko Software Cataloger
  18.4 Segmenting Consumers of Bath Soap
  18.5 Direct-Mail Fundraising
  18.6 Catalog Cross Selling
  18.7 Predicting Bankruptcy
  18.8 Time Series Case: Forecasting Public Transportation Demand

References
Index

Source: Nielsen BookData
