Spark in Action

Author : Jean-Georges Perrin
Publisher : Simon and Schuster
Total Pages : 574
Release : 2020-05-12
ISBN-10 : 1638351309
ISBN-13 : 9781638351306
Rating : 4/5 (309 Downloads)

Book Synopsis: Spark in Action by Jean-Georges Perrin

Spark in Action, written by Jean-Georges Perrin and published by Simon and Schuster, was released on 2020-05-12 and runs to 574 pages. Book excerpt:

Summary
The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop. Foreword by Rob Thomas.

About the technology
Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this varied volume like a champ, delivering speeds 100 times faster than Hadoop systems. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem.

About the book
Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.

What's inside
Writing Spark applications in Java
Spark application architecture
Ingestion through files, databases, streaming, and Elasticsearch
Querying distributed datasets with Spark SQL

About the reader
This book does not assume previous experience with Spark, Scala, or Hadoop.

About the author
Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years.

Table of Contents

PART 1 - THE THEORY CRIPPLED BY AWESOME EXAMPLES
1 So, what is Spark, anyway?
2 Architecture and flow
3 The majestic role of the dataframe
4 Fundamentally lazy
5 Building a simple app for deployment
6 Deploying your simple app

PART 2 - INGESTION
7 Ingestion from files
8 Ingestion from databases
9 Advanced ingestion: finding data sources and building your own
10 Ingestion through structured streaming

PART 3 - TRANSFORMING YOUR DATA
11 Working with SQL
12 Transforming your data
13 Transforming entire documents
14 Extending transformations with user-defined functions
15 Aggregating your data

PART 4 - GOING FURTHER
16 Cache and checkpoint: Enhancing Spark’s performances
17 Exporting data and building full data pipelines
18 Exploring deployment


Spark in Action Related Books

Spark in Action
Language: en
Pages: 574
Authors: Jean-Georges Perrin
Categories: Computers
Type: BOOK - Published: 2020-05-12 - Publisher: Simon and Schuster

Summary The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spar
Spark in Action
Language: en
Pages: 0
Authors: Petar Zecevic
Categories: Computers
Type: BOOK - Published: 2016-11-26 - Publisher: Manning

Summary Spark in Action teaches you the theory and skills you need to effectively handle batch and streaming data using Spark. Fully updated for Spark 2.0. Purc
Learning Spark
Language: en
Pages: 400
Authors: Jules S. Damji
Categories: Computers
Type: BOOK - Published: 2020-07-16 - Publisher: O'Reilly Media

Data is bigger, arrives faster, and comes in a variety of formats—and it all needs to be processed at scale for analytics or machine learning. But how can you
Advanced Analytics with Spark
Language: en
Pages: 290
Authors: Sandy Ryza
Categories: Computers
Type: BOOK - Published: 2015-04-02 - Publisher: O'Reilly Media, Inc.

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors
Spark: The Definitive Guide
Language: en
Pages: 594
Authors: Bill Chambers
Categories: Computers
Type: BOOK - Published: 2018-02-08 - Publisher: O'Reilly Media, Inc.

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With