Data Algorithms: Recipes for Scaling Up with Hadoop and Spark

ISBN-10: 1491906154
ISBN-13: 9781491906156
Category: Computers
Pages: 778
Language: English
Published: 2015-07-13
Publisher: O'Reilly Media, Inc.
Author: Mahmoud Parsian

Description

If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as building a recommendation system. You’ll learn how to implement the appropriate MapReduce solution with code that you can use in your projects. Dr. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. This book also includes an overview of MapReduce, Hadoop, and Spark.

Topics include:

  • Market basket analysis for a large set of transactions
  • Data mining algorithms (K-means, KNN, and Naive Bayes)
  • Using huge genomic data to sequence DNA and RNA
  • Naive Bayes theorem and Markov chains for data and market prediction
  • Recommendation algorithms and pairwise document similarity
  • Linear regression, Cox regression, and Pearson correlation
  • Allelic frequency and mining DNA
  • Social network analysis (recommendation systems, counting triangles, sentiment analysis)

Similar books

  • Algorithms for Data Science
    By John Chandler, Brian Steele, Swarna Reddy

    This book has three parts: (a) Data Reduction: Begins with the concepts of data reduction, data maps, and information extraction.

  • Algorithms and Data Structures for Massive Datasets
    By Dzejla Medjedovic, Emin Tahirovic

    Massive modern datasets make traditional data structures and algorithms grind to a halt. This fun and practical guide introduces cutting-edge techniques that can reliably handle even the largest distributed datasets....

  • Data Structures and Network Algorithms
    By Robert Endre Tarjan

    Data Structures and Network Algorithms attempts to provide the reader with both a practical understanding of the algorithms, described to facilitate their easy implementation, and an appreciation of the depth and beauty of the field of ...

  • Data Structures and Algorithms in Python
    By Michael T. Goodrich, Roberto Tamassia, Michael H. Goldwasser

    This is a "sister" book to Goodrich & Tamassia's "Data Structures and Algorithms in Java" and Goodrich, Tamassia and Mount's "Data Structures and Algorithms in C++".

  • Data Structures & Their Algorithms
    By Harry R. Lewis, Larry Denenberg

    Using only practically useful techniques, this book teaches methods for organizing, reorganizing, exploring, and retrieving data in digital computers, and the mathematical analysis of those techniques. The authors present analyses...

  • Data Structures and Algorithms
    By Shi Kuo Chang

    Backtracking. 5.3.1 Introduction. Backtracking is the exploration of a sequence of potential partial solutions to a problem until a solution is found or a dead end is reached, followed by a regression to an earlier point in the ...

  • Algorithms and Data Structures for Massive Datasets
    By Dzejla Medjedovic, Emin Tahirovic

    Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.

  • An Introduction to Data Structures and Algorithms
    By J.A. Storer

    The material is suitable for undergraduates or first-year graduates who need only review Chapters 1-4. This book may be used for a one-semester introductory course (based on Chapters 1-4 and portions of the chapters on algorithm design, ...

  • Disk-Based Algorithms for Big Data
    By Christopher G. Healey

    This includes a review of different in-memory sorting and searching algorithms that build a foundation for more sophisticated on-disk approaches like mergesort, B-trees, and extendible hashing.

  • A Common-Sense Guide to Data Structures and Algorithms: Level Up Your Core Programming Skills
    By Jay Wengrow

    You’ll even encounter a single keyword that can give your code a turbo boost. Jay Wengrow brings to this book the key teaching practices he developed as a web development bootcamp founder and educator.