Learning Spark: Lightning-Fast Big Data Analysis

Name: Learning Spark: Lightning-Fast Big Data Analysis
Rating: 5 (1 review)

ISBN-10: 1449359051
ISBN-13: 9781449359058
Series: Learning Spark
Category: Computers
Pages: 276
Language: English
Published: 2015-01-28
Publisher: O'Reilly Media, Inc.
Authors: Matei Zaharia, Holden Karau, Andy Konwinski

Description

Data in all domains is getting bigger. How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates. Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
- Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
- Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
- Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
- Learn how to deploy interactive, batch, and streaming applications
- Connect to data sources including HDFS, Hive, JSON, and S3
- Master advanced topics like data partitioning and shared variables

Other editions

Learning Spark
- 2020-07-16
- 400 pages
- Ebook
- O'Reilly Media, Inc.
Learning Spark: Lightning-Fast Big Data Analysis
- 2015-01-28
- 276 pages
- Paperback
- O'Reilly Media, Inc.
Learning Spark: Lightning-Fast Data Analytics
- 2020-07-31
- 300 pages
- Paperback
- O'Reilly Media
Learning Spark
- 2020-07-16
- 400 pages
- Paperback
- O'Reilly Media

Similar books

Spark: The Definitive Guide: Big Data Processing Made Simple
By Bill Chambers, Matei Zaharia
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework.
Learning Spark SQL
By Aurobindo Sarkar
Design, implement, and deliver successful streaming applications, machine learning pipelines, and graph applications using the Spark SQL API. About This Book: Learn about the design and implementation of streaming applications, machine learning ...
Hands-On Deep Learning with Apache Spark: Build and deploy distributed deep learning applications on Apache Spark
By Guglielmo Iozzia
What you will learn: Understand the basics of deep learning; set up Apache Spark for deep learning; understand the principles of distribution modeling and different types of neural networks; obtain an understanding of deep learning ...
Mastering Spark with R: The Complete Guide to Large-Scale Analysis and Modeling
By Javier Luraschi, Kevin Kuo, Edgar Ruiz
This book covers relevant data science topics, cluster computing, and issues that should interest even the most advanced users.
High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark
By Holden Karau, Rachel Warren
With this book, you’ll explore: how Spark SQL’s new interfaces improve performance over SQL’s RDD data structure; the choice between data joins in Core Spark and Spark SQL; techniques for getting the most out of standard RDD ...
Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib
By Jillur Quddus
You have most likely heard the terms big data, artificial intelligence, and machine learning, but now wish to ... with each other in order to ultimately architect and engineer end-to-end data intelligence and machine learning pipelines.
Learning PySpark
By Denny Lee, Tomasz Drabas
Build data-intensive applications locally and deploy at scale using the combined powers of Python and Spark 2.0. About This Book: Learn why and how you can efficiently use Python to process data and build machine learning models in Apache ...
Learning Apache Spark 2
By Muhammad Asif Abbasi
This book will be your one-stop solution. Who This Book Is For: This guide appeals to big data engineers, analysts, architects, software engineers, and even technical managers who need to perform efficient data processing on Hadoop in real time.
Beginning Apache Spark Using Azure Databricks: Unleashing Large Cluster Analytics in the Cloud
By Robert Ilijason
This book explains how the confluence of these pivotal technologies gives you enormous power, and cheaply, when it comes to huge datasets.
Spark in Action: Covers Apache Spark 3 with Examples in Java, Python, and Scala
By Jean-Georges Perrin
Table L.6, Options for ingesting and writing data from/to a database (continued). truncate: When SaveMode.Overwrite is enabled, this option ... customSchema: The custom schema to use for reading data from JDBC connectors.