Text analysis in Python for social scientists : prediction and classification
Author(s)
Bibliographic Information
Text analysis in Python for social scientists : prediction and classification
(Cambridge elements. Elements in quantitative and computational methods for the social sciences)
Cambridge University Press, 2022
- : pbk
Available at 3 libraries
Note
Includes bibliographical references (p. [83]-92)
Description and Table of Contents
Description
Text contains a wealth of information about a wide variety of sociocultural constructs. Automated prediction methods can infer these quantities (sentiment analysis is probably the most well-known application). However, there is virtually no limit to the kinds of things we can predict from text: power, trust, and misogyny are all signaled in language. These algorithms easily scale to corpus sizes infeasible for manual analysis. Prediction algorithms have become steadily more powerful, especially with the advent of neural network methods. However, applying these techniques usually requires profound programming knowledge and machine learning expertise. As a result, many social scientists do not apply them. This Element provides the working social scientist with an overview of the most common methods for text classification, an intuition of their applicability, and Python code to execute them. It covers both the ethical foundations of such work and the emerging potential of neural network methods.
Table of Contents
- 1. Introduction
- 2. Ethics, Fairness, and Bias
- 3. Classification
- 4. Text as Input
- 5. Labels
- 6. Train-Dev-Test
- 7. Performance Metrics
- 8. Comparison and Significance Testing
- 9. Overfitting and Regularization
- 10. Model Selection and Other Classifiers
- 11. Model Bias
- 12. Feature Selection
- 13. Structured Prediction
- 14. Neural Networks Background
- 15. Neural Architectures and Models
by "Nielsen BookData"