Big Data: Principles and Paradigms captures state-of-the-art research on the architectural aspects, technologies, and applications of Big Data. The book identifies potential future directions and technologies that facilitate insight into numerous scientific, business, and consumer applications.
To help realize Big Data’s full potential, the book addresses numerous challenges and offers conceptual and technological solutions for tackling them. These challenges include life-cycle data management, large-scale storage, flexible processing infrastructure, data modeling, scalable machine learning, data analysis algorithms, sampling techniques, and privacy and ethical issues.
Key Features
- Covers computational platforms supporting Big Data applications
- Addresses key principles underlying Big Data computing
- Examines key developments supporting next generation Big Data platforms
- Explores the challenges in Big Data computing and ways to overcome them
- Features contributions from experts in both academia and industry
Table of Contents
- List of contributors
- About the Editors
- Preface
- Organization of the Book
- Part I: Big Data Science
- Part II: Big Data Infrastructures and Platforms
- Part III: Big Data Security and Privacy
- Part IV: Big Data Applications
- Acknowledgments
- Part I: Big Data Science
- Chapter 1: BDA = ML + CC
- Abstract
- 1.1 Introduction
- 1.2 A Historical Review of Big Data
- 1.3 Historical Interpretation of Big Data
- 1.4 Defining Big Data From 3Vs to 32Vs
- 1.5 Big Data Analytics and Machine Learning
- 1.6 Big Data Analytics and Cloud Computing
- 1.7 Hadoop, HDFS, MapReduce, Spark, and Flink
- 1.8 ML + CC → BDA and Guidelines
- 1.9 Conclusion
- Chapter 2: Real-Time Analytics
- Abstract
- 2.1 Introduction
- 2.2 Computing Abstractions for Real-Time Analytics
- 2.3 Characteristics of Real-Time Systems
- 2.4 Real-Time Processing for Big Data — Concepts and Platforms
- 2.5 Data Stream Processing Platforms
- 2.6 Data Stream Analytics Platforms
- 2.7 Data Analysis and Analytic Techniques
- 2.8 Finance Domain Requirements and a Case Study
- 2.9 Future Research Challenges
- Chapter 3: Big Data Analytics for Social Media
- Abstract
- Acknowledgments
- 3.1 Introduction
- 3.2 NLP and Its Applications
- 3.3 Text Mining
- 3.4 Anomaly Detection
- Chapter 4: Deep Learning and Its Parallelization
- Abstract
- 4.1 Introduction
- 4.2 Concepts and Categories of Deep Learning
- 4.3 Parallel Optimization for Deep Learning
- 4.4 Discussions
- Chapter 5: Characterization and Traversal of Large Real-World Networks
- Abstract
- Acknowledgments
- 5.1 Introduction
- 5.2 Background
- 5.3 Characterization and Measurement
- 5.4 Efficient Complex Network Traversal
- 5.5 k-Core-Based Partitioning for Heterogeneous Graph Processing
- 5.6 Future Directions
- 5.7 Conclusions
- Part II: Big Data Infrastructures and Platforms
- Chapter 6: Database Techniques for Big Data
- Abstract
- 6.1 Introduction
- 6.2 Background
- 6.3 NoSQL Movement
- 6.4 NoSQL Solutions for Big Data Management
- 6.5 NoSQL Data Models
- 6.6 Future Directions
- 6.7 Conclusions
- Chapter 7: Resource Management in Big Data Processing Systems
- Abstract
- 7.1 Introduction
- 7.2 Types of Resource Management
- 7.3 Big Data Processing Systems and Platforms
- 7.4 Single-Resource Management in the Cloud
- 7.5 Multiresource Management in the Cloud
- 7.6 Related Work on Resource Management
- 7.7 Open Problems
- 7.8 Summary
- Chapter 8: Local Resource Consumption Shaping: A Case for MapReduce
- Abstract
- 8.1 Introduction
- 8.2 Motivation
- 8.3 Local Resource Shaper
- 8.4 Evaluation
- 8.5 Related Work
- 8.6 Conclusions
- Appendix: CPU Utilization With Different Slot Configurations and LRS
- Chapter 9: System Optimization for Big Data Processing
- Abstract
- 9.1 Introduction
- 9.2 Basic Framework of the Hadoop Ecosystem
- 9.3 Parallel Computation Framework: MapReduce
- 9.4 Job Scheduling of Hadoop
- 9.5 Performance Optimization of HDFS
- 9.6 Performance Optimization of HBase
- 9.7 Performance Enhancement of Hadoop System
- 9.8 Conclusions and Future Directions
- Chapter 10: Packing Algorithms for Big Data Replay on Multicore
- Abstract
- 10.1 Introduction
- 10.2 Performance Bottlenecks
- 10.3 The Big Data Replay Method
- 10.4 Packing Algorithms
- 10.5 Performance Analysis
- 10.6 Summary and Future Directions
- Part III: Big Data Security and Privacy
- Chapter 11: Spatial Privacy Challenges in Social Networks
- Abstract
- Acknowledgments
- 11.1 Introduction
- 11.2 Background
- 11.3 Spatial Aspects of Social Networks
- 11.4 Cloud-Based Big Data Infrastructure
- 11.5 Spatial Privacy Case Studies
- 11.6 Conclusions
- Chapter 12: Security and Privacy in Big Data
- Abstract
- 12.1 Introduction
- 12.2 Secure Queries Over Encrypted Big Data
- 12.3 Other Big Data Security
- 12.4 Privacy on Correlated Big Data
- 12.5 Future Directions
- 12.6 Conclusions
- Chapter 13: Location Inferring in Internet of Things and Big Data
- Abstract
- Acknowledgements
- 13.1 Introduction
- 13.2 Device-based Sensing Using Big Data
- 13.3 Device-free Sensing Using Big Data
- 13.4 Conclusion
- Part IV: Big Data Applications
- Chapter 14: A Framework for Mining Thai Public Opinions
- Abstract
- Acknowledgments
- 14.1 Introduction
- 14.2 XDOM
- 14.3 Implementation
- 14.4 Validation
- 14.5 Case Studies
- 14.6 Summary and Conclusions
- Chapter 15: A Case Study in Big Data Analytics: Exploring Twitter Sentiment Analysis and the Weather
- Abstract
- Acknowledgments
- 15.1 Background
- 15.2 Big Data System Components
- 15.3 Machine-Learning Methodology
- 15.4 System Implementation
- 15.5 Key Findings
- 15.6 Summary and Conclusions
- Chapter 16: Dynamic Uncertainty-Based Analytics for Caching Performance Improvements in Mobile Broadband Wireless Networks
- Abstract
- 16.1 Introduction
- 16.2 Background
- 16.3 Related Work
- 16.4 VoD Architecture
- 16.5 Overview
- 16.6 Data Generation
- 16.7 Edge and Core Components
- 16.8 INCA Caching Algorithm
- 16.9 QoE Estimation
- 16.10 Theoretical Framework
- 16.11 Experiments and Results
- 16.12 Synthetic Dataset
- 16.13 Conclusions and Future Directions
- Chapter 17: Big Data Analytics on a Smart Grid: Mining PMU Data for Event and Anomaly Detection
- Abstract
- Acknowledgments
- 17.1 Introduction
- 17.2 Smart Grid With PMUs and PDCs
- 17.3 Improving Traditional Workflow
- 17.4 Characterizing Normal Operation
- 17.5 Identifying Unusual Phenomena
- 17.6 Identifying Known Events
- 17.7 Related Efforts
- 17.8 Conclusion and Future Directions
- Chapter 18: eScience and Big Data Workflows in Clouds: A Taxonomy and Survey
- Abstract
- 18.1 Introduction
- 18.2 Background
- 18.3 Taxonomy and Review of eScience Services in the Cloud
- 18.4 Resource Provisioning for eScience Workflows in Clouds
- 18.5 Open Problems
- 18.6 Summary
- Index
Related Titles
- Liu, Computational and Statistical Methods for Analysing Big Data with Applications, Oct 2015, 184 pages, 9780128037324, $99.95
- Talburt, Entity Information Life Cycle for Big Data, Apr 2015, 236 pages, 9780128005378, $64.95
- Berman, Principles of Big Data, May 2013, 262 pages, 9780124045767, $64.95
- Loshin, Big Data Analytics, Aug 2013, 142 pages, 9780124173194, $29.95
- Krishnan, Data Warehousing in the Age of Big Data, May 2013, 346 pages, 9780124058910, $45.95
Readership: data scientists, data architects, DevOps engineers, cloud developers, and similar practitioners, as well as graduate data science students and other academic researchers.
Reviews
A good book that is well worth reading.