Learning Spark

Learning Spark Review

The Web is getting faster, and the data it delivers is getting bigger. How can you handle everything efficiently? This book introduces Spark, an open source cluster computing system that makes data analytics fast to run and fast to write. You’ll learn how to run programs faster, using primitives for in-memory cluster computing. With Spark, your job can load data into memory and query it repeatedly, much more quickly than with disk-based systems like Hadoop MapReduce.

Written by the developers of Spark, this book will have you up and running in no time. You’ll learn how to express MapReduce jobs with just a few simple lines of Spark code, instead of spending extra time and effort working with Hadoop’s raw Java API.

  • Quickly dive into Spark capabilities such as collect, count, reduce, and save
  • Use one programming paradigm instead of mixing and matching tools such as Hive, Hadoop, Mahout, and S4/Storm
  • Learn how to run interactive, iterative, and incremental analyses
  • Integrate with Scala to manipulate distributed datasets like local collections
  • Tackle partitioning issues, data locality, default hash partitioning, user-defined partitioners, and custom serialization
  • Use other languages by means of pipe() to achieve the equivalent of Hadoop streaming
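
To give a feel for what count, reduce, and collect mean in the list above, here is a minimal sketch in plain Python over an ordinary local list. It does not use Spark and is not taken from the book; the sample lines and all variable names are hypothetical. It only illustrates the shape of a word-count job written as operations on a collection, which is the style the blurb says Spark brings to distributed datasets.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample data standing in for a distributed dataset.
lines = [
    "spark makes analytics fast",
    "spark keeps data in memory",
]

# "Map" step: split every line into words, giving one flat list.
words = [word for line in lines for word in line.split()]

# count: the number of elements in the dataset.
total_words = len(words)

# reduce: fold all elements into one value with a binary function
# (here, keep the longer of two words; the first wins on a tie).
longest_word = reduce(lambda a, b: a if len(a) >= len(b) else b, words)

# Word count, the classic MapReduce job: emit (word, 1) pairs,
# then sum the ones for each distinct word.
pairs = [(word, 1) for word in words]
word_counts = Counter()
for word, n in pairs:
    word_counts[word] += n

# collect: gather the results into an ordinary local list.
result = sorted(word_counts.items())

print(total_words, longest_word, result)
```

On a real cluster the same steps would run across many machines and the data would not fit in a single list; this local version only shows the logic.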

Title: Learning Spark
Edition Language: English


    Some Testimonials About This Book:

  • Alex Ott

    Quite good introduction into the Spark - covers all components, and not so outdated - book covers 1.1 + parts of 1.2...

  • Jacek Laskowski

    Learning Spark from O'Reilly is a fun-Spark-tastic book! It has helped me to pull all the loose strings of knowledge about Spark together. The official documentation, articles, blog posts, the source ...

  • Steve

    Clearly you know if you need this book. And if you do, it's clear and readable. In some ways it's far out of date (it only covers up to version 1.2, now at 2.2 in 2018) but a lot of the concepts you n...

  • Bowei Chen

    This is good introductory book to Spark, which has become a hot big data processing engine since 2015. The book is short and easy to read. ...

  • Francis McGuire

    Out of date better reading, Spark quick start guides....

  • Jascha

    Over the last few years Big Data has gathered an incredible amount of momentum. All this fuzz and buzz resulted in top companies, as well as fearless start-ups, to invest hours and cash in data soluti...

  • Frank Palardy

    Does a good job introducing spark with coding examples. Needed another book to help explain things that were a little out of order....

  • Todd N

    Very good overview of Spark and guided tour through the APIs of its major components (GraphX being the notable exception). I preordered this book and finally got a chance to read it over spring break. ...

  • Michael Koltsov

    This particular book should be included if Spark will eventually get a nice and shiny box version with caps and T-shirts inside. What more can I say? This book is partly written by the creator of Spar...

  • Alberto Tristan Benavides

    I burned through this book over the course of a few days to brush up on my Spark technical chops. It was well-organized, the examples were clear and germane, and overall it’s a pretty solid overview...