Clinical prediction models : a practical approach to development, validation, and updating
Author
Steyerberg, Ewout W.
Bibliographic Information
Clinical prediction models : a practical approach to development, validation, and updating
(Statistics for biology and health)
Springer, c2019
2nd ed
University library holdings: 6 libraries
Notes
First ed.: 2009
Includes bibliographical references (p. 519-553) and index
Description and Table of Contents
Description
The second edition of this volume provides insight and practical illustrations of how modern statistical concepts and regression methods can be applied to medical prediction problems, including both diagnostic and prognostic outcomes. Many advances have been made in statistical approaches to outcome prediction, but a sensible strategy is needed for model development, validation, and updating, so that prediction models can better support medical practice.
There is an increasing need for personalized evidence-based medicine that uses an individualized approach to medical decision-making. In this Big Data era, there is expanded access to large volumes of routinely collected data and an increased number of applications for prediction models, such as targeted early detection of disease and individualized approaches to diagnostic testing and treatment. Clinical Prediction Models presents a practical checklist to consider in the development of a valid prediction model. Steps include preliminary considerations such as dealing with missing values; coding of predictors; selection of main effects and interactions for a multivariable model; estimation of model parameters with shrinkage methods and incorporation of external data; evaluation of performance and usefulness; internal validation; and presentation formats. The text also addresses common issues that make prediction models suboptimal, such as small sample sizes, exaggerated claims, and poor generalizability.
The text is primarily intended for clinical epidemiologists and biostatisticians. Including many case studies and publicly available R code and data sets, the book is also appropriate as a textbook for a graduate course on predictive modeling in diagnosis and prognosis. While practical in nature, the book also provides a philosophical perspective on data analysis in medicine that goes beyond predictive modeling.
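As a purely illustrative aside (not taken from the book itself), the development and validation steps listed above can be sketched in a few lines of R. The sketch below assumes the freely available rms package and a hypothetical data frame d with a binary outcome y and predictors age, sex, and marker; it shows flexible coding of a continuous predictor, bootstrap-based internal validation, and a calibration check, and it omits the missing-value handling that would normally precede these steps.

    library(rms)                                  # regression modeling strategies package
    dd <- datadist(d); options(datadist = "dd")   # register predictor distributions for rms

    # Multivariable logistic model; rcs() codes age as a restricted cubic spline
    fit <- lrm(y ~ rcs(age, 4) + sex + marker, data = d, x = TRUE, y = TRUE)

    # Internal validation by bootstrapping: optimism-corrected Dxy and calibration slope
    validate(fit, B = 200)

    # Bootstrap-corrected calibration curve
    plot(calibrate(fit, B = 200))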
Updates to this new and expanded edition include:
* A discussion of Big Data and its implications for the design of prediction models
* Machine learning issues
* More simulations with missing 'y' values
* Extended discussion on between-cohort heterogeneity
* Description of ShinyApp
* Updated LASSO illustration
* New case studies
Table of Contents
Preface
Acknowledgements

Chapter 1 Introduction
1.1 Diagnosis, prognosis and therapy choice in medicine; 1.1.1 Predictions for personalized evidence-based medicine
1.2 Statistical modeling for prediction; 1.2.1 Model assumptions; 1.2.2 Reliability of predictions: aleatory and epistemic uncertainty; 1.2.3 Sample size
1.3 Structure of the book; 1.3.1 Part I: Prediction models in medicine; 1.3.2 Part II: Developing internally valid prediction models; 1.3.3 Part III: Generalizability of prediction models; 1.3.4 Part IV: Applications

Part I: Prediction models in medicine

Chapter 2 Applications of prediction models
2.1 Applications: medical practice and research
2.2 Prediction models for Public Health; 2.2.1 Targeting of preventive interventions; *2.2.2 Example: prediction for breast cancer
2.3 Prediction models for clinical practice; 2.3.1 Decision support on test ordering; *2.3.2 Example: predicting renal artery stenosis; 2.3.3 Starting treatment: the treatment threshold; *2.3.4 Example: probability of deep venous thrombosis; 2.3.5 Intensity of treatment; *2.3.6 Example: defining a poor prognosis subgroup in cancer; 2.3.7 Cost-effectiveness of treatment; 2.3.8 Delaying treatment; *2.3.9 Example: spontaneous pregnancy chances; 2.3.10 Surgical decision-making; *2.3.11 Example: replacement of risky heart valves
2.4 Prediction models for medical research; 2.4.1 Inclusion and stratification in a RCT; *2.4.2 Example: selection for TBI trials; 2.4.3 Covariate adjustment in a RCT; 2.4.4 Gain in power by covariate adjustment; *2.4.5 Example: analysis of the GUSTO-III trial; 2.4.6 Prediction models and observational studies; 2.4.7 Propensity scores; *2.4.8 Example: statin treatment effects; 2.4.9 Provider comparisons; *2.4.10 Example: ranking cardiac outcome
2.5 Concluding remarks

Chapter 3 Study design for prediction modeling
3.1 Studies for prognosis; 3.1.1 Retrospective designs; *3.1.2 Example: predicting early mortality in esophageal cancer; 3.1.3 Prospective designs; *3.1.4 Example: predicting long-term mortality in esophageal cancer; 3.1.5 Registry data; *3.1.6 Example: surgical mortality in esophageal cancer; 3.1.7 Nested case-control studies; *3.1.8 Example: perioperative mortality in major vascular surgery
3.2 Studies for diagnosis; 3.2.1 Cross-sectional study design and multivariable modeling; *3.2.2 Example: diagnosing renal artery stenosis; 3.2.3 Case-control studies; *3.2.4 Example: diagnosing acute appendicitis
3.3 Predictors and outcome; 3.3.1 Strength of predictors; 3.3.2 Categories of predictors; 3.3.3 Costs of predictors; 3.3.4 Determinants of prognosis; 3.3.5 Prognosis in oncology
3.4 Reliability of predictors; 3.4.1 Observer variability; *3.4.2 Example: histology in Barrett's esophagus; 3.4.3 Biological variability; 3.4.4 Regression dilution bias; *3.4.5 Example: simulation study on reliability of a binary predictor; 3.4.6 Choice of predictors
3.5 Outcome; 3.5.1 Types of outcome; 3.5.2 Survival endpoints; *3.5.3 Examples: 5-year relative survival in cancer registries; 3.5.4 Composite endpoints; *3.5.5 Example: composite endpoints in cardiology; 3.5.6 Choice of prognostic outcome; 3.5.7 Diagnostic endpoints; *3.5.8 Example: PET scans in esophageal cancer
3.6 Phases of biomarker development
3.7 Statistical power and reliable estimation; 3.7.1 Sample size to identify predictor effects; 3.7.2 Sample size for reliable modeling; 3.7.3 Sample size for reliable validation
3.8 Concluding remarks

Chapter 4 Statistical models for prediction
4.1 Continuous outcomes; *4.1.1 Examples of linear regression; 4.1.2 Economic outcomes; *4.1.3 Example: prediction of costs; 4.1.4 Transforming the outcome; 4.1.5 Performance: explained variation; 4.1.6 More flexible approaches
4.2 Binary outcomes; 4.2.1 R2 in logistic regression analysis; 4.2.2 Calculation of R2 on the log likelihood scale; 4.2.3 Models related to logistic regression; 4.2.4 Bayes rule; 4.2.5 Prediction with Naive Bayes; 4.2.6 Calibration and Naive Bayes; *4.2.7 Logistic regression and Bayes; 4.2.8 Machine learning: more flexible approaches; 4.2.9 Classification and regression trees; *4.2.10 Example: mortality in acute MI patients; 4.2.11 Advantages and disadvantages of tree models; 4.2.12 Trees versus logistic regression modeling; *4.2.13 Other methods for binary outcomes; 4.2.14 Summary on binary outcomes
4.3 Categorical outcomes; 4.3.1 Polytomous logistic regression; 4.3.2 Example: histology of residual masses; *4.3.3 Alternative models; *4.3.4 Comparison of modeling approaches
4.4 Ordinal outcomes; 4.4.1 Proportional odds logistic regression; *4.4.2 Relevance of the proportional odds assumption in RCTs
4.5 Survival outcomes; 4.5.1 Cox proportional hazards regression; 4.5.2 Prediction with Cox models; 4.5.3 Proportionality assumption; 4.5.4 Kaplan-Meier analysis; *4.5.5 Example: impairment after treatment of leprosy; 4.5.6 Parametric survival; *4.5.7 Example: replacement of risky heart valves; 4.5.8 Summary on survival outcomes
4.6 Competing risks; 4.6.1 Actuarial and actual risks; 4.6.2 Absolute risk and the Fine & Gray model; 4.6.3 Example: Prediction of coronary heart disease incidence; 4.6.4 Multi-state modeling
4.7 Dynamic predictions; 4.7.1 Multi-state models and landmarking; 4.7.2 Joint models
4.8 Concluding remarks

Chapter 5 Overfitting and optimism in prediction models
5.1 Overfitting and optimism; 5.1.1 Example: surgical mortality in esophagectomy; 5.1.2 Variability within one center; 5.1.3 Variability between centers: noise vs. true heterogeneity; 5.1.4 Predicting mortality by center: shrinkage
5.2 Overfitting in regression models; 5.2.1 Model uncertainty and testimation bias; 5.2.2 Other modeling biases; 5.2.3 Overfitting by parameter uncertainty; 5.2.4 Optimism in model performance; 5.2.5 Optimism-corrected performance
5.3 Bootstrap resampling; 5.3.1 Applications of the bootstrap; 5.3.2 Bootstrapping for regression coefficients; 5.3.3 Bootstrapping for prediction: optimism correction; 5.3.4 Calculation of optimism-corrected performance; *5.3.5 Example: Stepwise selection in 429 patients
5.4 Cost of data analysis; *5.4.1 Degrees of freedom of a model; 5.4.2 Practical implications
5.5 Concluding remarks

Chapter 6 Choosing between alternative models
6.1 Prediction with statistical models; 6.1.1 Testing of model assumptions and prediction; 6.1.2 Choosing a type of model
6.2 Modeling age-outcome relations; *6.2.1 Age and mortality after acute MI; *6.2.2 Age and operative mortality; *6.2.3 Age-outcome relations in other diseases
6.3 Head-to-head comparisons; 6.3.1 StatLog results; *6.3.2 Cardiovascular disease prediction comparisons; *6.3.3 Traumatic brain injury modeling results
6.4 Concluding remarks

Part II: Developing valid prediction models
Checklist for developing valid prediction models

Chapter 7 Missing values
7.1 Missing values and prediction research; 7.1.1 Inefficiency of complete case analysis; 7.1.2 Interpretation of CC analyses; 7.1.3 Missing data mechanisms; 7.1.4 Missing outcome data; 7.1.5 Summary points
7.2 Prediction under MCAR, MAR and MNAR mechanisms; 7.2.1 Missingness patterns; 7.2.2 Missingness and estimated regression coefficients; 7.2.4 Missingness and estimated performance
7.3 Dealing with missing values in regression analysis; 7.3.1 Imputation principle; 7.3.2 Simple and more advanced single imputation methods; 7.3.3 Multiple imputation
7.4 Defining the imputation model; 7.4.1 Types of variables in the imputation model; *7.4.2 Transformations of variables; 7.4.3 Imputation models for SI; 7.4.4 Summary points
7.5 Success of imputation under MCAR, MAR and MNAR; 7.5.1 Imputation in a simple model; 7.5.2 Other simulation results; *7.5.3 Multiple predictors
7.6 Guidance to dealing with missing values in prediction research; 7.6.1 Patterns of missingness; 7.6.2 Simple approaches; 7.6.3 More advanced approaches; 7.6.4 Maximum fraction of missing values before omitting a predictor; 7.6.5 Single or multiple imputation for predictor effects?; 7.6.6 Single or multiple imputation for deriving predictions?; 7.6.7 Missings and predictions for new patients; *7.6.8 Performance across multiple imputed data sets; 7.6.9 Reporting of missing values in prediction research
7.7 Concluding remarks; 7.7.1 Summary statements; *7.7.2 Available software and challenges

Chapter 8 Case study on dealing with missing values
8.1 Introduction; 8.1.1 Aim of the IMPACT study; 8.1.2 Patient selection; 8.1.3 Potential predictors; 8.1.4 Coding and time dependency of predictors
8.2 Missing values in the IMPACT study; 8.2.1 Missing values in outcome; 8.2.2 Quantification of missingness of predictors; 8.2.3 Patterns of missingness
8.3 Imputation of missing predictor values; 8.3.1 Correlations between predictors; 8.3.2 Imputation model; 8.3.3 Distributions of imputed values; *8.3.4 Multilevel imputation
8.4 Predictor effect: adjusted analyses; 8.4.1 Adjusted analysis for complete predictors: age and motor score; 8.4.2 Adjusted analysis for incomplete predictors: pupils
8.5 Predictions: multivariable analyses; *8.5.1 Multilevel analyses
8.6 Concluding remarks

Chapter 9 Coding of categorical and continuous predictors
9.1 Categorical predictors; 9.1.1 Examples of categorical coding
9.2 Continuous predictors; *9.2.1 Examples of continuous predictors; 9.2.2 Categorization of continuous predictors
9.3 Non-linear functions for continuous predictors; 9.3.1 Polynomials; 9.3.2 Fractional polynomials (FP); 9.3.3 Splines; *9.3.4 Example: functional forms with RCS or FP; 9.3.5 Extrapolation and robustness; 9.3.6 Preference for FP or RCS?
9.4 Outliers and winsorizing; 9.4.1 Example: glucose values and outcome of TBI
9.5 Interpretation of effects of continuous predictors; *9.5.1 Example: predictor effects in TBI
9.6 Concluding remarks; 9.6.1 Software

Chapter 10 Restrictions on candidate predictors
10.1 Selection before studying the predictor-outcome relation; 10.1.1 Selection based on subject knowledge; *10.1.2 Examples: too many candidate predictors; 10.1.3 Meta-analysis for candidate predictors; *10.1.4 Example: predictors in testicular cancer; 10.1.5 Selection based on distributions
10.2 Combining similar variables; 10.2.1 Subject knowledge for grouping; 10.2.2 Assessing the equal weights assumption; 10.2.3 Biologically motivated weighting schemes; 10.2.4 Statistical combination
10.3 Averaging effects; *10.3.1 Example: Chlamydia trachomatis infection risks; *10.3.2 Example: acute surgery risk relevant for elective patients?
*10.4 Case study: family history for prediction of a genetic mutation; 10.4.1 Clinical background and patient data; 10.4.2 Similarity of effects; 10.4.3 CRC and adenoma in a proband; 10.4.5 Full prediction model for mutations
10.5 Concluding remarks

Chapter 11 Selection of main effects
11.1 Predictor selection; 11.1.1 Reduction before modeling; 11.1.2 Reduction while modeling; 11.1.3 Collinearity; 11.1.4 Parsimony; 11.1.5 Non-significant candidate predictors; 11.1.6 Summary points on predictor selection
11.2 Stepwise selection; 11.2.1 Stepwise selection variants; 11.2.2 Stopping rules in stepwise selection
11.3 Advantages of stepwise methods
11.4 Disadvantages of stepwise methods; 11.4.1 Instability of selection; 11.4.2 Testimation: bias in selected coefficients; *11.4.3 Testimation: empirical illustrations; 11.4.4 Misspecification of variability and p-values
11.5 Influence of noise variables
11.6 Univariate analyses and model specification; 11.6.1 Pros and cons of univariate pre-selection; *11.6.2 Testing of predictors in a domain
11.7 Modern selection methods; *11.7.1 Bootstrapping for selection; *11.7.2 Bagging and boosting; *11.7.3 Bayesian model averaging (BMA); 11.7.4 Shrinkage of regression coefficients to zero
11.8 Concluding remarks

Chapter 12 Assumptions in regression models: Additivity and linearity
12.1 Additivity and interaction terms; 12.1.1 Potential interaction terms to consider; 12.1.2 Interactions with treatment; 12.1.3 Other potential interactions; *12.1.4 Example: time and survival after valve replacement
12.2 Selection, estimation and performance with interaction terms; 12.2.1 Example: age interactions in GUSTO-I; 12.2.2 Estimation of interaction terms; 12.2.3 Better prediction with interaction terms?; 12.2.4 Summary points
12.3 Non-linearity in multivariable analysis; 12.3.1 Multivariable restricted cubic splines (rcs); 12.3.2 Multivariable fractional polynomials (FP); 12.3.3 Multivariable splines in gam
12.4 Example: non-linearity in testicular cancer case study; *12.4.1 Details of multivariable FP and gam analyses; *12.4.2 GAM in univariate and multivariable analysis; *12.4.3 Predictive performance; *12.4.4 R code for non-linear modeling in testicular cancer example
12.5 Concluding remarks; 12.5.1 Recommendations

Chapter 13 Modern estimation methods
13.1 Predictions from regression and other models; *13.1.1 Estimation with other modeling approaches
13.2 Shrinkage; 13.2.1 Uniform shrinkage; 13.2.2 Uniform shrinkage: illustration
13.3 Penalized estimation; *13.3.1 Penalized maximum likelihood estimation; 13.3.2 Penalized ML: illustration; *13.3.3 Optimal penalty by bootstrapping; 13.3.4 Firth regression; *13.3.5 Firth regression: illustration
*13.4.1 Estimation of a LASSO model
13.5 Elastic net; *13.5.1 Estimation of Elastic Net model
13.6 Performance after shrinkage; 13.6.1 Shrinkage, penalization, and model selection
13.7 Concluding remarks

Chapter 14 Estimation with external information
Background
14.1 Combining literature and individual patient data (IPD); 14.1.1 A global prediction model; *14.1.2 A global model for traumatic brain injury; 14.1.3 Developing a local prediction model; 14.1.4 Adaptation of univariate coefficients; *14.1.5 Adaptation method 1; *14.1.6 Adaptation method 2; *14.1.7 Estimation of adaptation factors; *14.1.8 Simulation results; 14.1.9 Performance of the adapted model
14.2 Case study: prediction model for AAA surgical mortality; 14.2.1 Meta-analysis; 14.2.2 Individual patient data analysis; 14.2.3 Adaptation and clinical presentation
14.3 Alternative approaches; 14.3.1 Overall calibration; 14.3.2 Stacked regressions; 14.3.3 Bayesian methods: using data priors to regression modeling; 14.3.4 Example: predicting neonatal death; *14.3.5 Example: aneurysm study
14.4 Concluding remarks

Chapter 15 Evaluation of performance
15.1 Overall performance measures; 15.1.1 Explained variation: R2; 15.1.2 Brier score; 15.1.3 Performance of testicular cancer prediction model
15.3.4 Assessment of moderate calibration; 15.3.5 Assessment of strong calibration; 15.3.6 Calibration of survival predictions; 15.3.7 Example: calibration in testicular cancer prediction model; *15.3.8 R code for assessing calibration; 15.3.9 Calibration and discrimination
15.4 Concluding remarks; 15.4.1 Bibliographic notes

Chapter 16 Evaluation of clinical usefulness
16.1 Clinical usefulness; 16.1.1 Intuitive approach to the cutoff; 16.1.2 Decision-analytic approach: benefit vs harm; 16.1.3 Accuracy measures for clinical usefulness; 16.1.4 Decision curve analysis; 16.1.5 Interpreting net benefit in decision curves; 16.1.6 Example: clinical usefulness of prediction in testicular cancer; 16.1.7 Decision curves for testicular cancer example; 16.1.8 Verification bias and clinical usefulness; *16.1.9 R code
16.2 Discrimination, calibration, and clinical usefulness; 16.2.1 Discrimination, calibration, and Net Benefit in the testicular cancer case study; 16.2.2 Aims of prediction models and performance measures; 16.2.3 Summary points
16.3 From prediction models to decision rules; 16.3.1 Performance of decision rules; 16.3.2 Treatment benefit in prognostic subgroups; 16.3.3 Evaluation of classification systems
16.4 Concluding remarks

Chapter 17 Validation of prediction models
17.1 Internal versus external validation, and validity; 17.1.1 Assessment of internal and external validity
17.2 Internal validation techniques; 17.2.1 Apparent validation; 17.2.3 Cross-validation; 17.2.4 Bootstrap validation; 17.2.5 Internal validation combined with imputation
17.3 External validation studies; 17.3.1 Temporal validation; *17.3.2 Example: validation of a model for Lynch syndrome; 17.3.3 Geographic validation; 17.3.4 Fully independent validation; 17.3.5 Reasons for poor validation
17.4 Concluding remarks

Chapter 18 Presentation formats
18.1 Prediction models versus decision rules
18.2 Clinical prediction models; 18.2.1 Regression formulas; 18.2.2 Confidence intervals for predictions; 18.2.3 Nomograms; 18.2.4 Score chart; 18.2.5 Tables with predictions; 18.2.6 Specific formats; 18.2.7 Black box presentations
18.3 Case study: clinical prediction model for testicular cancer; 18.3.1 Regression formula from logistic model; 18.3.2 Nomogram; *18.3.3 Score chart; 18.3.4 Summary points
18.4 Clinical decision rules; 18.4.1 Regression tree; 18.4.2 Score chart rule; 18.4.3 Survival groups; 18.4.4 Meta-model
18.5 Concluding remarks

Part III: Generalizability of prediction models

Chapter 19 Patterns of external validity
19.1 Determinants of external validity; 19.1.1 Case-mix; 19.1.2 Differences in case-mix; 19.1.3 Differences in regression coefficients
19.2.1 Simulation set-up; 19.2.2 Performance measures
19.3 Distribution of predictors; 19.3.1 More or less severe case-mix according to X; *19.3.2 Interpretation of testicular cancer validation; 19.3.3 More or less heterogeneous case-mix according to X; 19.3.4 More or less severe case-mix according to Z; 19.3.5 More or less heterogeneous case-mix according to Z
19.4 Distribution of observed outcome y
19.5 Coefficients; 19.5.1 Coefficient of linear predictor < 1; 19.5.2 Coefficients different
19.6 Summary of patterns of invalidity; 19.6.1 Other scenarios of invalidity
19.7 Reference values for performance; 19.7.1 Model-based performance: performance if the model is valid; 19.7.2 Performance with refitting; *19.7.3 Examples: testicular cancer and TBI; *19.7.4 R code
19.8 Limited validation sample size; 19.8.1 Uncertainty in validation of performance; *19.8.2 Estimating standard errors in validation studies; 19.8.3 Summary points
19.9 Design of external validation studies; 19.9.1 Power of external validation studies; *19.9.2 Calculating sample sizes for validation studies; 19.9.3 Rules for sample size of validation studies; 19.9.4 Summary points
19.10 Concluding remarks

Chapter 20 Updating for a new setting
20.1 Updating only the intercept; 20.1.1 Simple updating methods
20.2 Approaches to more extensive updating; 20.2.1 Eight updating methods for predicting binary outcomes
20.3 Validation and updating in GUSTO-I; 20.3.1 Validity of TIMI-II model for GUSTO-I; 20.3.2 Updating the TIMI-II model for GUSTO-I; 20.3.3 Performance of updated models; *20.3.4 R code for updating methods
20.4 Shrinkage and updating; 20.4.1 Shrinkage towards recalibrated values in GUSTO-I; *20.4.2 R code for shrinkage and penalization in updating; 20.4.4 Bayesian updating
20.5 Sample size and updating strategy; *20.5.1 Simulations of sample size, shrinkage, and updating strategy; 20.5.2 A closed test for the choice of updating strategy
20.6 Validation and updating of tree models
20.7 Validation and updating of survival models; *20.7.1 Validation of a simple index for non-Hodgkin's lymphoma; 20.7.2 Updating the prognostic index; 20.7.3 Recalibration for groups by time points; 20.7.4 Recalibration with a Cox or Weibull regression model; 20.7.6 Summary points
20.8 Continuous updating; *20.8.1 Precision and updating strategy; *20.8.2 Continuous updating in GUSTO-I; *20.8.3 Other dynamic modeling approaches
20.9 Concluding remarks; *20.9.1 Further illustrations of updating

Chapter 21 Updating for multiple settings
21.1 Differences in outcome; 21.1.1 Testing for calibration-in-the-large; *21.1.2 Illustration of heterogeneity in GUSTO-I; 21.1.3 Updating for better calibration-in-the-large; 21.1.4 Empirical Bayes estimates; *21.1.5 Illustration of updating in GUSTO-I; 21.1.6 Testing and updating of predictor effects; *21.1.7 Heterogeneity of predictor effects in GUSTO-I; *21.1.8 R code for random effect analyses in GUSTO-I
21.2 Provider profiling; 21.2.1 Ranking of centers: the expected rank; *21.2.2 Example: provider profiling in stroke; *21.2.4 Estimating and interpreting differences between centers; *21.2.5 Ranking of centers; *21.2.6 R code for provider profiling
21.3 Concluding remarks; *21.3.1 Further literature

Part IV: Applications

Chapter 22 Case study on a prediction of 30-day mortality
22.1 GUSTO-I study; 22.1.1 Acute myocardial infarction; *22.1.2 Treatment results from GUSTO-I; 22.1.3 Prognostic modeling in GUSTO-I
22.2 General considerations of model development; 22.2.1 Research question and intended application; 22.2.2 Outcome and predictors; 22.2.3 Study design and analysis
22.3 Seven modeling steps in GUSTO-I; 22.3.1 Preliminary; 22.3.2 Coding of predictors; 22.3.3 Model specification; 22.3.4 Model estimation; 22.3.5 Model performance; 22.3.6 Model validation; 22.3.7 Presentation; 22.3.8 Predictions
22.4 Validity; 22.4.1 Internal validity: overfitting; 22.4.2 External validity: generalizability; 22.4.3 Summary points
22.5 Translation into clinical practice; 22.5.1 Score chart for choosing thrombolytic therapy; 22.5.2 From predictions to decisions
22.6 Concluding remarks

Chapter 23 Case study on survival analysis: prediction of cardiovascular events
23.1 Prognosis in the SMART study; *23.1.1 Patients in SMART
23.2 General considerations in SMART; 23.2.1 Research question and intended application; 23.2.2 Outcome and predictors; 23.2.3 Study design and analysis
23.3 Preliminary modeling steps in the SMART cohort; 23.3.1 Patterns of missing values; 23.3.2 Imputation of missing values; 23.3.3 R code
23.4 Coding of predictors; 23.4.1 Extreme values; 23.4.2 Transforming continuous predictors; 23.4.3 Combining predictors with similar effects; 23.4.4 R code
23.5.1 A full model; 23.5.2 Impact of imputation; 23.5.3 R code for full model and imputation variants
23.6 Model selection and estimation; 23.6.1 Stepwise selection; 23.6.2 LASSO for selection with imputed data
23.7 Model performance and internal validation; 23.7.1 Estimation of optimism in performance; 23.7.2 Model presentation; 23.7.3 R code for presentations
23.8 Concluding remarks

Chapter 24 Overall lessons and data sets
24.1 Sample size; 24.1.1 Model selection, estimation, and sample size; 24.1.2 Calibration improvement by penalization; 24.1.3 Poorer performance with more predictors; 24.1.4 Model selection with noise predictors; 24.1.5 Potential solutions; 24.1.6 R code for model selection and penalization
24.2 Validation; 24.2.1 Examples of internal and external validation
24.3 Subject matter knowledge versus machine learning; 24.3.1 Exploiting subject matter knowledge; 24.3.2 Machine learning and Big Data
24.4 Reporting of prediction models and risk of bias assessments; 24.4.1 Reporting guidelines; 24.4.2 Risk of bias assessment
24.5 Data sets; 24.5.1 GUSTO-I prediction models; 24.5.2 SMART case study; 24.5.3 Testicular cancer case study; 24.5.4 Abdominal aortic aneurysm case study
24.6 Concluding remarks

References
Description from "Nielsen BookData"