Practical data science with Python 3 : synthesizing actionable insights from data

Author(s)

    • Varga, Ervin

Bibliographic Information

Practical data science with Python 3 : synthesizing actionable insights from data

Ervin Varga

(Books for professionals by professionals)

Apress, c2019

  • : pbk

Available at 1 library

Note

Includes index

Description and Table of Contents

Description

Gain insight into essential data science skills in a holistic manner using data engineering and associated scalable computational methods. This book covers the most popular Python 3 frameworks for both local and distributed (on-premises and cloud-based) processing. Along the way, you will be introduced to many popular open-source frameworks, such as SciPy, scikit-learn, Numba, and Apache Spark. The book is structured around examples, so you will grasp core concepts via case studies and Python 3 code.

As data science projects get continuously larger and more complex, software engineering knowledge and experience are crucial to produce evolvable solutions. You'll see how to create maintainable software for data science and how to document data engineering practices.

This book is a good starting point for people who want to gain practical skills to perform data science. All the code will be available in the form of IPython notebooks and Python 3 programs, which allow you to reproduce all analyses from the book and customize them for your own purpose. You'll also benefit from advanced topics like Machine Learning, Recommender Systems, and Security in Data Science. Practical Data Science with Python will empower you to analyze data, formulate proper questions, and produce actionable insights, three core stages in most data science endeavors.

What You'll Learn

  • Play the role of a data scientist when completing increasingly challenging exercises using Python 3
  • Work with proven data science techniques/technologies
  • Review scalable software engineering practices to ramp up data analysis abilities in the realm of Big Data
  • Apply theory of probability, statistical inference, and algebra to understand the data science practices

Who This Book Is For

Anyone who would like to venture into the realm of data science using Python 3.

Table of Contents

  • Chapter 1. Introduction to Data Science (number of pages: 10). This chapter introduces the reader to data science and describes the major stages of working with data (collect, explore, preprocess, visualize, predict, and infer knowledge). It sets common expectations about what constitutes a data science domain. This chapter will also elaborate on the Anaconda IDE, which will be used in the book.
  • Chapter 2. Data Acquisition (number of pages: 40). This chapter will show the reader how to retrieve and store data from/to various data sources: text files (including various formats like CSV, XML, and JSON), binary files (including Apache Avro), Web-accessible data, relational databases, NoSQL databases, Apache Arrow (an efficient and novel columnar data storage system), multi-modal databases, and network databases. This chapter will also introduce BeautifulSoup to work with XML and HTML.
  • Chapter 3. Basic Data Processing (number of pages: 40). This chapter covers standard Python libraries for scientific computing and data processing. NumPy encompasses all sorts of data structures required during data analysis. Here, we will provide examples that illuminate the importance of sophisticated frameworks and reuse-based software engineering in the realm of data science.
  • Chapter 4. Documenting Work (number of pages: 20). This chapter introduces the most popular computing environment for data analysis. It makes sharing of results between data scientists possible in an easily reproducible manner.
  • Chapter 5. Transformation and Packaging of Data (number of pages: 30). This chapter illuminates a critical data science framework that is built upon NumPy. It provides excellent data structures for handling data frames and series.
  • Chapter 6. Visualization (number of pages: 40). This chapter introduces various ways to visualize data; summary statistics or tabular representations are of limited value in exploring data. The following frameworks will be the topic of this chapter: matplotlib, glueviz, Bokeh, and orange3. Visualization is important both while doing exploratory analysis and when generating effective reports.
  • Chapter 7. Prediction and Inference (number of pages: 50). This chapter will talk about techniques and technologies to properly scale data science efforts. It will teach readers how to create systems that may formulate answers on unseen data or find hidden patterns in data. It will elaborate on supervised, unsupervised, deep, and reinforcement learning methods. Moreover, it will introduce Apache Spark with MLlib (both in batch and stream modes) as well as TensorFlow. The following frameworks will also be the topic of this chapter: XGBoost, scikit-learn, and Keras with PyTorch.
  • Chapter 8. Network Analysis (number of pages: 40). This chapter explores the ways to analyze complex networks and graphs. It will introduce Apache Spark GraphX, Apache Giraph, and NetworkX. It will also introduce spectral graph analysis, which is an interesting approximate, non-linear, and non-parametric machine learning method.
  • Chapter 9. Data Science Process Engineering (number of pages: 20). This chapter will elaborate on how to share and customize data science practices/methods used by teams via OMG Essence.
  • Chapter 10. Multi-agent Systems, Game Theory and Machine Learning (number of pages: 30). This chapter explores advanced data-oriented applications, where data are produced and consumed by self-governed intelligent agents. The chapter introduces the reader to the concept of multi-agent systems, game-theoretic methods and models, as well as associated learning algorithms.
  • Chapter 11. Probabilistic Graphical Models (number of pages: 30). This chapter explains the most sophisticated form of a graph structure to model many advanced data science problems. Nodes in the graph denote random variables, while the links represent relations between those variables. This chapter equips the reader with a method that may be used when simpler solutions aren't satisfactory.
  • Chapter 12. Security in Data Science (number of pages: 20). This chapter presents techniques to anonymize data and to deal with situations when learning methods must cope with adversarial modifications (a.k.a. adversarial machine learning). This chapter also talks about ways to protect data both in transit and at rest.
  • Appendix A. Crash Course in Python 3 (number of pages: 20). This appendix will briefly teach readers about Python 3 and explain why Python 3 is a perfect choice for doing data science.

by "Nielsen BookData"
