Subset selection in regression

Bibliographic information

A.J. Miller

(Monographs on statistics and applied probability, 40)

Chapman and Hall, c1990

Held by 31 university libraries

Notes

Bibliography: p. [215]-226

Includes index

Description and table of contents

Description

Most scientific computing packages contain facilities for stepwise regression, and often for 'all subsets' and other techniques for finding 'best-fitting' subsets of regression variables. The application of standard theory can be very misleading when the model has not been chosen a priori but from the data. There is widespread awareness that considerable over-fitting occurs and that prediction equations obtained after extensive 'data dredging' often perform poorly when applied to new data.

This monograph relates almost entirely to least-squares methods of finding and fitting subsets of regression variables, though most of the concepts are presented in terms of the interpretation and statistical properties of orthogonal projections. An early chapter introduces these methods, which are still not widely known among least-squares users. Existing methods are described for testing whether any useful improvement can be obtained by using any of a set of predictors, and Spjøtvoll's method for comparing two arbitrary subsets of predictor variables is described and illustrated in detail.

When the selected model is the 'best-fitting' in some sense, conventional fitting methods give estimates of regression coefficients that are usually biased in the direction of being too large. The extent of this bias is demonstrated for simple cases. Various ad hoc methods for correcting the bias are discussed (ridge regression, James-Stein shrinkage, jack-knifing, etc.), together with the author's maximum likelihood technique. Areas in which further research is needed are also outlined.
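The selection bias described above is easy to reproduce numerically. The following sketch (not from the book; the settings and variable names are illustrative) repeats the first step of forward selection on simulated data in which every candidate predictor has the same small true coefficient, then records the fitted coefficient of the winning variable:

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, true_beta, n_reps = 50, 10, 0.2, 2000   # every predictor has coefficient 0.2

    winners = []
    for _ in range(n_reps):
        X = rng.standard_normal((n, p))
        y = X @ np.full(p, true_beta) + rng.standard_normal(n)
        # First step of forward selection: pick the predictor most correlated with y
        score = np.abs(X.T @ y) / np.linalg.norm(X, axis=0)
        j = np.argmax(score)
        # Single-variable least-squares coefficient for the selected predictor
        winners.append(abs(X[:, j] @ y / (X[:, j] @ X[:, j])))

    print(f"true |beta| = {true_beta}, mean selected |beta_hat| = {np.mean(winners):.2f}")

Because the winner is the largest of several noisy estimates, its average magnitude comes out well above the true 0.2; this is the selection bias that the later chapters set out to quantify and correct.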

Table of contents

Part 1 Objectives:
  • prediction, explanation, elimination or what?
  • how many variables in the prediction formula?
  • alternatives to using subsets
  • "black-box" use of best-subsets techniques

Part 2 Least-squares computations:
  • using sums of squares and products (SSP) matrices
  • orthogonal reduction methods
  • Gauss-Jordan v. orthogonal reduction methods
  • interpretation of projections

Part 3 Finding subsets that fit well:
  • objectives and limitations of this chapter
  • forward selection
  • Efroymson's algorithm
  • backward elimination
  • sequential replacement algorithms
  • generating all subsets
  • using branch-and-bound techniques
  • grouping variables
  • ridge regression and other alternatives

Part 4 Hypothesis testing:
  • is there any information in the remaining variables?
  • is one subset better than another?

Part 5 Estimation of regression coefficients:
  • selection bias
  • choice between two variables
  • selection bias in the general case, and its reductions
  • conditional likelihood estimation
  • the effectiveness of maximum likelihood
  • estimation - summary and further work

Part 6 How many variables?:
  • introduction
  • mean squared errors of prediction (MSEP)
  • cross-validation and the PRESS statistic

From "Nielsen BookData"
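As an aside on the final chapter's topics: for least squares, the PRESS statistic does not require refitting the model n times, because each leave-one-out residual equals the ordinary residual divided by 1 - h_ii, where h_ii is the i-th diagonal element of the hat matrix. A minimal sketch using that identity (the function name and test data are illustrative, not taken from the book):

    import numpy as np

    def press_statistic(X, y):
        """Leave-one-out prediction sum of squares for an ordinary
        least-squares fit, computed from a single fit via the
        identity e_(i) = e_i / (1 - h_ii)."""
        X1 = np.column_stack([np.ones(len(y)), X])      # add an intercept column
        H = X1 @ np.linalg.solve(X1.T @ X1, X1.T)       # hat matrix X(X'X)^-1 X'
        loo = (y - H @ y) / (1.0 - np.diag(H))          # deleted residuals
        return float(loo @ loo)

    # Compare two candidate subsets of predictors by PRESS (smaller is better)
    rng = np.random.default_rng(1)
    X = rng.standard_normal((40, 5))
    y = X[:, 0] - 0.5 * X[:, 1] + rng.standard_normal(40)
    print(press_statistic(X[:, :2], y), press_statistic(X[:, :5], y))

On data like these the two-variable subset usually gives the smaller PRESS, since the remaining columns are pure noise.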
