In the age of the Internet of Things and social media platforms, huge amounts of digital data are generated by and collected from many sources, including sensors, mobile devices, wearable trackers and security cameras. These data, commonly referred to as big data, are challenging current storage, processing and analysis capabilities. New models, languages, systems and algorithms continue to be developed to effectively collect, store, analyze and learn from big data.
Programming Big Data Applications introduces and discusses models, programming frameworks and algorithms to process and analyze large amounts of data. In particular, the book provides an in-depth description of the properties and mechanisms of the main programming paradigms for big data analysis, including the MapReduce, workflow, BSP, message-passing, and SQL-like models. Through programming examples it also describes the most widely used frameworks for big data analysis, such as Hadoop, Spark, MPI, Hive and Storm. The systems are discussed and compared, highlighting their main features, their adoption (both within their community of developers and among users), and their main advantages and disadvantages in implementing big data analysis applications.
Contents:
Preface
About the Authors
Acknowledgments
List of Figures
List of Tables
Introduction:
- Motivation and Goals
- Main Topics
- Audience and Organization
- Online Resources
Big Data Concepts:
- Big Data Principles and Features
- Data Science Concepts
- Big Data Storage
- Scalable Data Analysis
- Parallel Computing
- Cloud Computing
- Toward Exascale Computing
- Parallel and Distributed Machine Learning
Programming Models for Big Data:
- Parallel Programming for Big Data Applications
- The MapReduce Model
- The Workflow Model
- The Message-Passing Model
- The BSP Model
- The SQL-Like Model
- The PGAS Model
- Models for Exascale Systems
Tools for Big Data Applications:
- Introduction
- MapReduce-based Programming Tools
- Workflow-based Programming Tools
- Message Passing-based Programming Tools
- BSP-based Programming Tools
- SQL-like Programming Tools
- PGAS-based Programming Tools
Comparing Programming Tools:
- Introduction
- Comparative Analysis of the System Features
- Comparative Analysis through Application Examples
Choosing the Right Framework to Tame Big Data:
- The Input Data
- The Application Class
- The Infrastructure
- Other Factors
Supplementary Material
Bibliography
Index
Readership: Undergraduate and graduate students in computer science, computer engineering, data science, and data engineering. PhD students and researchers in computer science and engineering, and data science.
Key Features:
- Helps designers and developers program Big Data applications by guiding them in identifying and selecting the most appropriate programming tool based on their skills, hardware availability, application domains and purposes, and the support provided by the developer community
- Presents real programming examples for each programming language/framework to show how Big Data applications can be developed