CiNii Books - Analyzing textual information : from words to meanings through numbers

Bibliographic Information

Analyzing textual information : from words to meanings through numbers

Johannes Ledolter, Lea S. VanderVelde

(Sage publications series. Quantitative applications in the social sciences ; v. 188)

Sage, c2022

: pbk

Available at / 13 libraries

Kwansei Gakuin University Library (Uegahara)

: pbk301:2506:1880006610836

Gakushuin University Library (Law and Economics)

: pbk301.6A/L49a//N0200635191

Kobe Shoin Women's University Library / Kobe Shoin Women's College Library

: pbk301/712580254

Daito Bunka University 60th Anniversary Memorial Library

: pbk1213946026

Chuo University Central Library (社情)

: pbk301.01/L4700029877883

Chukyo University Toyota Library

: pbk301.6/Q 1/1880993546

University of Tokyo, Komaba Library社

: pbk〓:F:118:6B1B3014496990

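University of Tokyo, Interfaculty Initiative in Information Studies / Graduate School of Interdisciplinary Information Studies Library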

: pbkB-b:55:1886613318069

Faculty of Letters Library, University of Tokyo社会

: pbk4819708845

Tohoku Univ. Main Library (Main Building)

: pbk00210039174

Doshisha University Library (Imadegawa)

: pbk301.6||Q620||188212300418

Meiji Gakuin University Library図

: pbk300.1:Q:1880107055907

Ritsumeikan University Main Library

: pbk11003672486

Note

Includes bibliographical references and index

Description and Table of Contents

Description

Researchers in the social sciences and beyond are dealing more and more with massive quantities of text data requiring analysis, from historical letters to the constant stream of content in social media. Traditional texts on statistical analysis have focused on numbers, but this book will provide a practical introduction to the quantitative analysis of textual data. Using up-to-date R methods, this book will take readers through the text analysis process, from text mining and pre-processing the text to final analysis. It includes two major case studies using historical and more contemporary text data to demonstrate the practical applications of these methods. Currently, there is no introductory how-to book on textual data analysis with R that is up-to-date and applicable across the social sciences. Code and a variety of additional resources to enrich the use of this book are available on an accompanying website. These resources include data files from the 39th Congress, and also the collection of tweets of President Trump, now no longer available to researchers via Twitter itself.

Table of Contents

Series Editor's Introduction
Preface
Acknowledgments
About the Authors
Chapter 1: Introduction
  1.1 Text Data
  1.2 The Two Applications Considered in This Book
  1.3 Introductory Example and Its Analysis Using the R Statistical Software
  1.4 The Introductory Example Revisited, Illustrating Concordance and Collocation Using Alternative Software
  1.5 Concluding Remarks
  1.6 References
Chapter 2: A Description of the Studied Text Corpora and A Discussion of Our Modeling Strategy
  2.1 Introduction to the Corpora: Selecting the Texts
  2.2 Debates of the 39th U.S. Congress, as recorded in the Congressional Globe
  2.3 The Territorial Papers of the United States
  2.4 Analyzing Text Data: Bottom-Up or Top-Down Analysis
  2.5 References
  Appendix to Chapter 2: The Complete Congressional Record
Chapter 3: Preparing Text for Analysis: Text Cleaning and Formatting
  3.1 Text Cleaning
  3.2 Text Formatting
  3.3 Concluding Remarks
  3.4 References
Chapter 4: Word Distributions: Document-Term Matrices of Word Frequencies and the "Bag of Words" Representation
  4.1 Document-Term Matrices of Frequencies
  4.2 Displaying Word Frequencies
  4.3 Co-Occurrence of Terms in the Same Document
  4.4 The Zipf Law: An Interesting Fact About the Distribution of Word Frequencies
  4.5 References
Chapter 5: Metavariables and Text Analysis Stratified on Metavariables
  5.1 The Significance of Stratification and the Importance of Metavariables
  5.2 Analysis of the Territorial Papers
  5.3 Analysis of Speeches From the 39th Congress
  5.4 References
Chapter 6: Sentiment Analysis
  6.1 Lexicons of Sentiment-Charged Words
  6.2 Applying Sentiment Analysis to the Letters of the Territorial Papers
  6.3 Using Other Sentiment Dictionaries and the R Software tidytext for Sentiment Analysis
  6.4 Concluding Remarks: An Alternative Approach for Sentiment Analysis
  6.5 References
Chapter 7: Clustering of Documents
  7.1 Clustering Documents
  7.2 Measures for the Closeness and the Distance of Documents
  7.3 Methods for Clustering Documents
  7.4 Illustrating Clustering Methods on a Simulated Example
  7.5 References
Chapter 8: Classification of Documents
  8.1 Introduction
  8.2 Classification Procedures
  8.3 Two Examples Using the Congressional Speech Database
  8.4 Concluding Remarks on Authorship Attribution: Commenting on the Field of Stylometry
  8.5 References
Chapter 9: Modeling Text Data: Topic Models
  9.1 Topic Models
  9.2 Fitting Topic Models to the Two Corpora Studied in This Book
  9.3 References
Chapter 10: n-Grams and Other Ways of Analyzing Adjacent Words
  10.1 Analysis of Bigrams
  10.2 Text Windows to Measure Word Associations Within a Neighborhood of Words and a Discussion of the R Package text2vec
  10.3 Illustrating the Use of n-Grams: Speeches of the 39th Congress
Chapter 11: Concluding Remarks
Appendix: Listing of Website Resources

by "Nielsen BookData"