Use this practical guide to handle the challenges encountered when
designing an enterprise data lake and learn industry best practices
for resolving them.
When designing an enterprise data lake, you often hit a roadblock when
you must leave the comfort of the relational world and learn the nuances
of handling non-relational data. Starting from sourcing data into the
Hadoop ecosystem, you will go through stages such as data processing,
data querying, and security, each of which can raise tough questions. Concepts
such as change data capture and data streaming are covered. The book
takes an end-to-end solution approach in a data lake environment that
includes data security, high availability, data processing, data
streaming, and more.
Each chapter includes the application of a concept, code snippets, and
use case demonstrations to provide you with a practical approach. For
each topic, you will learn the concept, its scope, its application, and
a starting point.
What You'll Learn
- Get to know data lake architecture and design principles
- Implement data capture and streaming strategies
- Implement data processing strategies in Hadoop
- Understand the data lake security framework and availability model
Who This Book Is For
Big data architects and solution architects