Introduction to Cloudera CDH

Original · Data Analysis · Author: ilsyx · Date: 2017-08-30 01:40:41




CDH is a Hadoop distribution. Its peers at the same level are HDP and Apache Hadoop itself.




The official documentation shows that CDH is one member of the Cloudera Enterprise product family.



The following is from the Introduction of the Cloudera Enterprise documentation (5.12 is the latest version at the time of writing):


Cloudera provides a scalable, flexible, integrated platform that makes it easy to manage rapidly increasing volumes and varieties of data in your enterprise. Cloudera products and solutions enable you to deploy and manage Apache Hadoop and related projects, manipulate and analyze your data, and keep that data secure and protected.

Cloudera provides the following products and tools:

  • CDH—The Cloudera distribution of Apache Hadoop and other related open-source projects, including Apache Impala (incubating) and Cloudera Search. CDH also provides security and integration with numerous hardware and software solutions.
  • Apache Impala (incubating)—A massively parallel processing SQL engine for interactive analytics and business intelligence. Its highly optimized architecture makes it ideally suited for traditional BI-style queries with joins, aggregations, and subqueries. It can query Hadoop data files from a variety of sources, including those produced by MapReduce jobs or loaded into Hive tables. The YARN resource management component lets Impala coexist on clusters running batch workloads concurrently with Impala SQL queries. You can manage Impala alongside other Hadoop components through the Cloudera Manager user interface, and secure its data through the Sentry authorization framework.
  • Cloudera Search—Provides near real-time access to data stored in or ingested into Hadoop and HBase. Search provides near real-time indexing, batch indexing, full-text exploration and navigated drill-down, as well as a simple, full-text interface that requires no SQL or programming skills. Fully integrated in the data-processing platform, Search uses the flexible, scalable, and robust storage system included with CDH. This eliminates the need to move large data sets across infrastructures to perform business tasks.
  • Cloudera Manager—A sophisticated application used to deploy, manage, monitor, and diagnose issues with your CDH deployments. Cloudera Manager provides the Admin Console, a web-based user interface that makes administration of your enterprise data simple and straightforward. It also includes the Cloudera Manager API, which you can use to obtain cluster health information and metrics, as well as configure Cloudera Manager.
  • Cloudera Navigator—End-to-end data management and security for the CDH platform. Cloudera Navigator Data Management enables administrators, data managers, and analysts to explore vast data collections in Hadoop. Cloudera Navigator Encrypt simplifies the storage and management of encryption keys. The robust auditing, data management, lineage management, lifecycle management, and encryption key management in Cloudera Navigator allow enterprises to adhere to stringent compliance and regulatory requirements.
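
The Cloudera Manager API mentioned in the list above is a REST interface, so cluster health can be read with a plain HTTP GET. The snippet below is only a rough sketch under several assumptions that are not in the original text: the host name `cm-host.example.com` is hypothetical, port 7180 is the usual Admin Console port, `v17` is assumed to be the API version that matches Cloudera Manager 5.12, and the `entityStatus` field is assumed to be present in the cluster list response. The sample response is hand-written for illustration and is not real output.

```python
import json


def clusters_url(host, port=7180, api_version="v17"):
    """Build the URL that lists clusters (host, port and version are assumptions)."""
    return f"http://{host}:{port}/api/{api_version}/clusters"


def summarize_health(response_body):
    """Map each cluster name to its health status from a JSON response body.

    Assumes the body has the shape {"items": [{"name": ..., "entityStatus": ...}]}.
    Clusters without an entityStatus field are reported as "UNKNOWN".
    """
    payload = json.loads(response_body)
    return {
        item["name"]: item.get("entityStatus", "UNKNOWN")
        for item in payload.get("items", [])
    }


# Hand-written sample body, standing in for what a GET on clusters_url(...)
# might return. It is illustrative only.
sample_response = json.dumps({
    "items": [
        {"name": "cluster1", "version": "CDH5", "entityStatus": "GOOD_HEALTH"},
        {"name": "cluster2", "version": "CDH5"},
    ]
})

if __name__ == "__main__":
    print(clusters_url("cm-host.example.com"))
    print(summarize_health(sample_response))
```

In a real deployment the response body would come from an authenticated HTTP request to the URL built by `clusters_url`. The request itself is left out here so the sketch runs without a cluster.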


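From this description, Cloudera provides the following products and tools: CDH, Apache Impala, Cloudera Search, Cloudera Manager, and Cloudera Navigator. CDH itself includes Apache Impala and Cloudera Search, so in summary Cloudera offers three main pieces: CDH, Cloudera Manager, and Cloudera Navigator.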



CDH Overview

CDH delivers the core elements of Hadoop



Cloudera Manager 5 Overview

With Cloudera Manager, you can easily deploy and centrally operate the complete CDH stack and other managed services.








Cloudera Navigator Data Management Overview

Cloudera Navigator Data Management is a complete solution for data governance, auditing, and related data management tasks that is fully integrated with the Hadoop platform.


Is Cloudera Navigator a module of Cloudera Manager?

Not exactly. Cloudera Navigator is installed separately, after Cloudera Manager is installed, and it interacts behind the scenes with Cloudera Manager to deliver some of its core functionality. Cloudera Manager is used by cluster administrators to manage the cluster and all its services. Cloudera Navigator is used by administrators but also by security and governance teams, data stewards, and others to audit, trace data lineage from source raw data through final form, and perform other comprehensive data governance and stewardship tasks.



If you do not need data security auditing or similar governance features, you can skip installing Cloudera Navigator.





