Learning Spark

ISBN-10: 1492050016
ISBN-13: 9781492050018
Series: Learning Spark
Category: Computers
Pages: 400
Language: English
Published: 2020-07-16
Publisher: O'Reilly Media
Authors: Jules S. Damji, Brooke Wenig, Tathagata Das

Description

Data is bigger, arrives faster, and comes in a variety of formats, and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark. Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matters. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms. Through step-by-step walk-throughs, code snippets, and notebooks, you’ll be able to:
- Learn Python, SQL, Scala, or Java high-level Structured APIs
- Understand Spark operations and SQL Engine
- Inspect, tune, and debug Spark operations with Spark configurations and Spark UI
- Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
- Perform analytics on batch and streaming data using Structured Streaming
- Build reliable data pipelines with open source Delta Lake and Spark
- Develop machine learning pipelines with MLlib and productionize models using MLflow

Other editions

Learning Spark: Lightning-Fast Data Analytics
- 2020-07-31
- 300 pages
- Paperback
- O'Reilly Media
Learning Spark
- 2020-07-16
- 400 pages
- Ebook
- O'Reilly Media, Inc.
Learning Spark: Lightning-Fast Big Data Analysis
- 2015-01-28
- 276 pages
- Paperback
- O'Reilly Media, Inc.

Similar books

Learning Spark SQL
By Aurobindo Sarkar
Design, implement, and deliver successful streaming applications, machine learning pipelines, and graph applications using Spark SQL API. About This Book: Learn about the design and implementation of streaming applications, machine learning ...
Spark: The Definitive Guide: Big Data Processing Made Simple
By Bill Chambers, Matei Zaharia
Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework.
Learning PySpark
By Denny Lee, Tomasz Drabas
Build data-intensive applications locally and deploy at scale using the combined powers of Python and Spark 2.0. About This Book: Learn why and how you can efficiently use Python to process data and build machine learning models in Apache ...
Hands-On Deep Learning with Apache Spark: Build and deploy distributed deep learning applications on Apache Spark
By Guglielmo Iozzia
What you will learn: Understand the basics of deep learning; set up Apache Spark for deep learning; understand the principles of distribution modeling and different types of neural networks; obtain an understanding of deep learning ...
Mastering Spark with R: The Complete Guide to Large-Scale Analysis and Modeling
By Javier Luraschi, Kevin Kuo, Edgar Ruiz
This book covers relevant data science topics, cluster computing, and issues that should interest even the most advanced users.
High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark
By Holden Karau, Rachel Warren
With this book, you’ll explore: how Spark SQL’s new interfaces improve performance over SQL’s RDD data structure; the choice between data joins in Core Spark and Spark SQL; techniques for getting the most out of standard RDD ...
Machine Learning in Python: Essential Techniques for Predictive Analysis
By Michael Bowles
This book demonstrates how machine learning can be implemented using the more widely used and accessible Python programming language.
Learning Apache Spark 2
By Muhammad Asif Abbasi
This book will be your one-stop solution. Who This Book Is For: This guide appeals to big data engineers, analysts, architects, software engineers, and even technical managers who need to perform efficient data processing on Hadoop in real time.
Machine Learning with Spark - Second Edition
By Rajdeep Dua, Manpreet Singh Ghotra, Nick Pentreath
Create scalable machine learning applications to power a modern data-driven business using Spark 2.x. About This Book: Get to grips with the latest version of Apache Spark; utilize Spark's machine learning library to implement predictive ...
Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib
By Jillur Quddus
You have most likely heard the terms big data, artificial intelligence, and machine learning, but now wish to ... with each other in order to ultimately architect and engineer end-to-end data intelligence and machine learning pipelines.