Last updated: 2021-07-02 12:50:36
Cover page
Title Page
Copyright and Credits
Mastering Hadoop 3
Dedication
About Packt
Why subscribe?
Packt.com
Foreword
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Code in action
Conventions used
Get in touch
Reviews
Section 1: Introduction to Hadoop 3
Journey to Hadoop 3
Hadoop origins and timelines
Origins
MapReduce origin
Timelines
Overview of Hadoop 3 and its features
Hadoop logical view
Hadoop distributions
On-premise distribution
Cloud distributions
Points to remember
Summary
Deep Dive into the Hadoop Distributed File System
Technical requirements
Defining HDFS
Deep dive into the HDFS architecture
HDFS logical architecture
Concepts of the data group
Blocks
Replication
HDFS communication architecture
NameNode internals
Data locality and rack awareness
DataNode internals
Quorum Journal Manager (QJM)
HDFS high availability in Hadoop 3.x
Data management
Metadata management
Checkpoint using a secondary NameNode
Data integrity
HDFS Snapshots
Data rebalancing
Best practices for using balancer
HDFS reads and writes
Write workflows
Read workflows
Short circuit reads
Managing disk-skewed data in Hadoop 3.x
Lazy persist writes in HDFS
Erasure coding in Hadoop 3.x
Advantages of erasure coding
Disadvantages of erasure coding
HDFS common interfaces
HDFS read
HDFS write
HDFSFileSystemWrite.java
HDFS delete
HDFS command reference
File System commands
Distributed copy
Admin commands
YARN Resource Management in Hadoop
Architecture
Resource Manager component
Node Manager core
Introduction to YARN job scheduling
FIFO scheduler
Capacity scheduler
Configuring capacity scheduler
Fair scheduler
Scheduling queues
Configuring fair scheduler
Resource Manager high availability
Architecture of RM high availability
Configuring Resource Manager high availability
Node labels
Configuring node labels
YARN Timeline server in Hadoop 3.x
Configuring YARN Timeline server
Opportunistic containers in Hadoop 3.x
Configuring opportunistic containers
Docker containers in YARN
Configuring Docker containers
Running the Docker image