CDP-DC Installation

Installing CDP-DC is similar to installing CDH. In this post I will walk through the steps for an easy installation of CDP-DC.

Steps

Node Preparation
CM Installation

Pre-requisite

links for CM and CDP parcels
JDK and Node Pre-requisites

Node Preparation

I am going to install on 6 nodes. Below are the categories of nodes I am going to use.

Type	Count
Management Node	1
Master Node	2
Slave Nodes	3
Gateway Node	1

Management Node

Consists of management services like host/event/
Cloudera Manager will be installed on it
Admins will have access to this node for management

Master Nodes

Master services will be installed on these nodes
HDFS/YARN/HBase masters
They will be memory- and core-rich nodes

Slave Nodes

Slave services like DataNode, NodeManager, ZooKeeper, etc.
They should have plenty of storage
Average memory/CPU cores

Gateway Node

It will be acting as edge node or client node
All the clients will be installed on this node
Users will be accessing the cluster from this node.

CM / CDP Installation

Now that we have 6 nodes available, we need the trial version of CM. Download the installer from the Cloudera archive and run it:

# run as root on node-1
$ wget https://archive.cloudera.com/cm7/7.1.3/cloudera-manager-installer.bin
$ chmod u+x cloudera-manager-installer.bin
$ ./cloudera-manager-installer.bin

You will see a blue installer screen; keep pressing Next to proceed through it.

This installs CM and the JDK on node-1. Once it is done, open node-1:7180/cmf in a browser and log in with admin/admin. After logging in, select the second option, Try Cloudera Data Platform for 60 days.
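
The CM web UI can take a little while to come up after the installer exits. A minimal sketch for waiting on it, assuming curl is available on the machine you run it from; the function names cm_url and wait_for_cm, the retry count and the sleep interval are my own choices for illustration, not part of the Cloudera tooling:

```shell
# Build the CM URL for a host; port defaults to 7180 as used in this post.
cm_url() {
  printf 'http://%s:%s/cmf\n' "$1" "${2:-7180}"
}

# Poll the CM URL until it answers or we run out of attempts.
# Arguments: host, port (default 7180), attempts (default 30).
wait_for_cm() {
  url=$(cm_url "$1" "${2:-7180}")
  attempts=${3:-30}
  n=0
  while [ "$n" -lt "$attempts" ]; do
    # curl exits 0 as soon as the server returns any HTTP response.
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "Cloudera Manager is answering at $url"
      return 0
    fi
    n=$((n + 1))
    sleep 10
  done
  echo "Gave up waiting for $url" >&2
  return 1
}
```

For example, wait_for_cm node-1 polls http://node-1:7180/cmf every 10 seconds for up to 30 attempts.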

The next page asks for the list of hostnames. You can provide it as a pattern such as c2110-node[1-6].hortonworks.com. Then select the version you want to install on the cluster.
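
If you want to see which hosts a pattern like this covers before pasting it into the wizard, the range can be expanded by hand. This is only a sketch of the idea in plain POSIX shell; expand_hosts is a hypothetical helper of mine, and it handles a single numeric [start-end] range, which may be narrower than what the CM wizard itself accepts:

```shell
# Expand "prefix[START-END]suffix" into one hostname per line.
# A pattern without a range is printed unchanged.
expand_hosts() {
  pattern=$1
  case $pattern in
    *\[*-*\]*)
      prefix=${pattern%%\[*}   # text before the opening bracket
      rest=${pattern#*\[}      # text after the opening bracket
      range=${rest%%\]*}       # "START-END"
      suffix=${rest#*\]}       # text after the closing bracket
      start=${range%%-*}
      end=${range#*-}
      i=$start
      while [ "$i" -le "$end" ]; do
        printf '%s%s%s\n' "$prefix" "$i" "$suffix"
        i=$((i + 1))
      done
      ;;
    *)
      printf '%s\n' "$pattern"
      ;;
  esac
}

expand_hosts 'c2110-node[1-6].hortonworks.com'
```

The output is one fully qualified hostname per line, which is also handy as input for a loop that checks SSH access to every node.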

Select the Cloudera-provided JDK and click Next. After that you need to provide the password or private key of the root user. Once you provide these details, it starts installing the agents and the CDP parcels.

After the parcels are installed and activated, the next screen asks you to validate the hosts and the network:

Inspect Network Performance
Inspect Hosts
I understand the risks of not running the inspections or the detected issues, let me continue with cluster setup.

Click Next. Here you can see multiple choices: Cloudera provides several templates for different use cases, so you can either install the services for your use case directly or install with customization.

After selecting Next, if you want to install any of ranger / hive / rangerkms / oozie / hue / das, you need to set up the database first. I installed MySQL on node-3 with the commands below.

$ wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
$ rpm -ivh mysql-community-release-el7-5.noarch.rpm
$ yum update
$ yum install mysql-server
$ systemctl start mysqld
$ systemctl enable mysqld

# download the jar and place it under /usr/share/java on CM node-1
$ wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
$ tar zxvf mysql-connector-java-5.1.46.tar.gz
$ mkdir -p /usr/share/java/
$ cd mysql-connector-java-5.1.46
$ cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
$ chmod 644 /usr/share/java/mysql-connector-java.jar

Now create the users and databases as shown below. You can choose your own names and passwords.

CREATE DATABASE ranger DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON ranger.* TO 'ranger'@'%' IDENTIFIED BY 'hadoop';
GRANT ALL ON ranger.* TO 'ranger'@'localhost' IDENTIFIED BY 'hadoop';

CREATE DATABASE rangerkms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rangerkms.* TO 'rangerkms'@'%' IDENTIFIED BY 'hadoop';
GRANT ALL ON rangerkms.* TO 'rangerkms'@'localhost' IDENTIFIED BY 'hadoop';

CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hadoop';
GRANT ALL ON hue.* TO 'hue'@'localhost' IDENTIFIED BY 'hadoop';

CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hadoop';
GRANT ALL ON metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hadoop';

CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'hadoop';
GRANT ALL ON rman.* TO 'rman'@'localhost' IDENTIFIED BY 'hadoop';
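
Since the statements above repeat the same three lines per service, they can also be generated. Here is a rough sketch that prints the same SQL for each database/user pair used in this post; gen_db_sql and the DB_PASSWORD variable are names I made up for illustration. Note that GRANT ... IDENTIFIED BY works on the MySQL 5.x line installed above but was removed in MySQL 8.0, so this is not meant for newer servers:

```shell
# Print the CREATE DATABASE and GRANT statements for one service.
# Arguments: database name, user name, password.
gen_db_sql() {
  db=$1
  user=$2
  pass=$3
  cat <<EOF
CREATE DATABASE ${db} DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON ${db}.* TO '${user}'@'%' IDENTIFIED BY '${pass}';
GRANT ALL ON ${db}.* TO '${user}'@'localhost' IDENTIFIED BY '${pass}';
EOF
}

# database:user pairs taken from the statements in this post.
for pair in ranger:ranger rangerkms:rangerkms hue:hue metastore:hive rman:rman; do
  gen_db_sql "${pair%%:*}" "${pair#*:}" "${DB_PASSWORD:-hadoop}"
done
```

The script only prints the SQL. To apply it you could save the output to a file, review it, and feed it to the mysql client as root on node-3.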

Once this is done, you can assign the roles, that is, which services will be installed on which node. After selecting the services for each node, click Continue. You will land on the database page, where you need to provide the details of the databases created above.

After this, you will land on the configuration page, where you need to provide the directories for the services. I used the paths below; they are created automatically.

Property	Path
dfs.datanode.data.dir	/data/datanode
dfs.namenode.name.dir	/data/namenode
dfs.namenode.checkpoint.dir	/data/snamenode
yarn.nodemanager.local-dirs	/node/local
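
Even though the leaf directories get created for you, it can be worth checking on each node that the parent mount points exist and have space before you continue. A small sketch of such a check, assuming the paths from the table above; check_parent is a hypothetical helper and you would run it on every DataNode/NameNode host yourself:

```shell
# Report whether the parent of a planned service directory exists and
# how much space is available there.
check_parent() {
  parent=$(dirname "$1")
  if [ -d "$parent" ]; then
    # df -P prints one line per filesystem; column 4 is available 1K blocks.
    avail=$(df -Pk "$parent" | awk 'NR == 2 { print $4 }')
    echo "$1: parent $parent exists, ${avail} KB available"
  else
    echo "$1: parent $parent is missing"
    return 1
  fi
}

# Paths from the table above; keep going even if one parent is missing.
for dir in /data/datanode /data/namenode /data/snamenode /node/local; do
  check_parent "$dir" || true
done
```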

Click Next. It will start installing and running commands in the background; once it completes, click Finish. You will then be redirected to the Cloudera Manager home page.

After Installation

There are a few steps you need to perform after installing the services.

Install yarn Jars/Dependencies and HS directories on Hdfs
Root Directory for Hbase
Initialize the solr
Create hive userdir and Warehouse directories
Enable HA for HDFS and YARN
Migration of CM database from embedded to another database

Install yarn Jars/Dependencies

Run the following actions from the YARN service:

Cluster -> Yarn -> Actions -> Install YARN MapReduce Framework Jars
Cluster -> Yarn -> Actions -> Install YARN Service Dependencies
Cluster -> Yarn -> Actions -> Create Job History Dir
Cluster -> Yarn -> Actions -> Create NodeManager Remote Application Log Directory

Root Directory for Hbase

Create the root directory on hdfs for Hbase

Cluster -> Hbase -> Actions -> Create Root Directory

Initialize the solr

Before starting Solr you need to initialize it, otherwise it will not start.

Cluster -> CDP-INFRA-SOLR -> Actions -> Initialize Solr

Create hive userdir and Warehouse directories

Create the Hive user directory and the warehouse directories for managed and external tables.

Cluster -> Hive -> Actions -> Create Hive User Directory
Cluster -> Hive -> Actions -> Create Hive Warehouse Directory
Cluster -> Hive -> Actions -> Create Hive Warehouse External Directory

Enable HA for HDFS

Cluster -> Hdfs -> Actions -> Enable High Availability

You need to provide the nameservice name and select the second NameNode host along with the JournalNodes.

nameservice       : cdpdcservice
Standby Namenode  : c2110-node-1
Journalnode       : c2110-node-[3-5] [same as datanode]
Journal_node_path : /data/journalnode

For each Hive service, stop the service, back up the Hive Metastore database to a persistent store, run the service command Update Hive Metastore NameNodes, and then restart the Hive service.

Cluster -> Hive -> Actions -> Stop
Cluster -> Hive -> Actions -> Update Hive Metastore NameNodes
Cluster -> Hive -> Actions -> Start
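
For the backup step, since the metastore in this setup lives in MySQL, a plain mysqldump taken while Hive is stopped is one way to do it. This is a sketch under my assumptions: the database and user are the metastore/hive pair created earlier, and backup_name, backup_metastore and the BACKUP_DIR variable are illustrative names, not Cloudera commands:

```shell
# Build a timestamped file name such as metastore-20240101-120000.sql.
backup_name() {
  printf '%s-%s.sql\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Dump the metastore database to BACKUP_DIR (default /var/backups).
# Arguments: database (default metastore), user (default hive),
# MySQL host (default localhost). mysqldump prompts for the password.
backup_metastore() {
  db=${1:-metastore}
  user=${2:-hive}
  host=${3:-localhost}
  out=${BACKUP_DIR:-/var/backups}/$(backup_name "$db")
  # --single-transaction takes a consistent snapshot of InnoDB tables.
  mysqldump --single-transaction -h "$host" -u "$user" -p "$db" > "$out" &&
    echo "Wrote $out"
}
```

You would call backup_metastore between the Stop and the Update Hive Metastore NameNodes actions, then copy the resulting file off the node.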

Enable HA for YARN

Cluster -> Yarn -> Actions -> Enable High Availability

Here you need to select the node where you want to install the standby ResourceManager. I selected c2110-node2.
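
Once both HA wizards finish, you can check from the gateway node that each pair really has one active and one standby member. The sketch below only summarises the command output; ha_summary is a hypothetical helper, and it assumes each line of the -getAllServiceState output ends with the state word (active or standby):

```shell
# Read "-getAllServiceState" style output on stdin and count the states.
# Exits 0 only when there is exactly one active and at least one standby.
ha_summary() {
  awk '
    $NF == "active"  { active++ }
    $NF == "standby" { standby++ }
    END {
      printf "active=%d standby=%d\n", active, standby
      exit !(active == 1 && standby >= 1)
    }'
}
```

Typical usage would be hdfs haadmin -getAllServiceState | ha_summary for the NameNodes and yarn rmadmin -getAllServiceState | ha_summary for the ResourceManagers, run as a user that is allowed to call the admin commands.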
