Creating good data : a guide to dataset structure and data representation

Author(s)

    • Foxwell, Harry J.

Bibliographic Information

Creating good data : a guide to dataset structure and data representation

Harry J. Foxwell

(Books for professionals by professionals)

Apress, c2020

  • : pbk

Note

Includes bibliographical references and index

Description and Table of Contents

Description

Create good data from the start, rather than fixing it after it is collected. By following the guidelines in this book, you will be able to conduct more effective analyses and produce timely presentations of research data. Data analysts are often presented with datasets for exploration and study that are poorly designed, leading to difficulties in interpretation and to delays in producing meaningful results. Much data analytics training focuses on how to clean and transform datasets before serious analyses can even be started. Inappropriate or confusing representations, unit of measurement choices, coding errors, missing values, outliers, etc., can be avoided by using good dataset design and by understanding how data types determine the kinds of analyses which can be performed.

This book discusses the principles and best practices of dataset creation, and covers basic data types and their related appropriate statistics and visualizations. A key focus of the book is why certain data types are chosen for representing concepts and measurements, in contrast to the typical discussions of how to analyze a specific data type once it has been selected.

What You Will Learn

    • Be aware of the principles of creating and collecting data
    • Know the basic data types and representations
    • Select data types, anticipating analysis goals
    • Understand dataset structures and practices for analyzing and sharing
    • Be guided by examples and use cases (good and bad)
    • Use cleaning tools and methods to create good data

Who This Book Is For

Researchers who design studies and collect data and subsequently conduct and report the results of their analyses can use the best practices in this book to produce better descriptions and interpretations of their work. In addition, data analysts who explore and explain data of other researchers will be able to create better datasets.

Table of Contents

Introduction
    • Goal: The problem of dataset cleaning and why better design is needed
    • Who this book is for

Chapter 1: Basic Data Types
    • Goal: understanding data types
    • Nominal, ordinal, interval, ratio, other
    • How/why to choose specific representations

Chapter 2: Planning Your Data Collection
    • Goal: preventive action, avoiding data creation errors
    • Anticipating your required analysis
    • The goals of descriptive statistics and visualizations
    • The goals of relationship statistics and visualizations
    • Independent and dependent variables

Chapter 3: Dataset Structures
    • Goal: Understanding how to structure/store data
    • Types of datasets
    • .csv, SQL, Excel, Web, JSON
    • Sharing data (open formats)
    • Managing datasets

Chapter 4: Data Collection Issues
    • Goal: Understanding how to collect data
    • Understand and avoid Bias Sampling

Chapter 5: Examples and Use Cases
    • Goal: Illustrate good & not so good datasets

Chapter 6: Tools for Dataset Cleaning
    • Goal: still need some data cleanup? here's some help
    • Data cleaning using R, Python, commercial tools (e.g., Tableau)

Annotated References
    • Goal: include helpful data design and cleaning references

by "Nielsen BookData"
