Big data processing using Spark in cloud
Author(s)
Bibliographic Information
Big data processing using Spark in cloud
(Studies in big data, v. 43)
Springer, 2019
- : [hardback]
Available at 1 library
Note
"Corrected publication 2019"--T.p. verso
Includes bibliographical references
Description and Table of Contents
Description
The book describes the emergence of big data technologies and the role of Spark in the big data stack. It compares Spark and Hadoop and identifies the shortcomings of Hadoop that Spark overcomes. The book focuses mainly on the architecture of Spark in depth and on Spark RDDs: how the RDD complements the immutable nature of big data through lazy evaluation, caching, and type inference. It also addresses advanced topics in Spark, starting with the basics of Scala and the core Spark framework, and exploring Spark data frames, machine learning using MLlib, graph analytics using GraphX, and real-time processing with Apache Kafka, AWS Kinesis, and Azure Event Hubs. It then goes on to investigate Spark using PySpark and R. Focusing on the current big data stack, the book examines how Spark interacts with current big data tools, with Spark as the core processing layer for all types of data.
The book is intended for data engineers and scientists working on massive datasets and big data technologies in the cloud. In addition to industry professionals, it is helpful for aspiring data processing professionals and students working in big data processing and cloud computing environments.
Table of Contents
Concepts of Big Data and Apache Spark.- Big Data Analysis in Cloud and Machine Learning.- Security Issues and Challenges related to Big Data.- Big Data Security Solutions in Cloud.- Data Science and Analytics.- Big Data Technologies.- Data Analysis with Casandra and Spark.- Spin up the Spark Cluster.- Learn Scala.- IO for Spark.- Processing with Spark.- Spark Data Frames and Spark SQL.- Machine Learning and Advanced Analytics.- Parallel Programming with Spark.- Distributed Graph Processing with Spark.- Real Time Processing with Spark.- Spark in Real World.- Case Studies.
by "Nielsen BookData"