
Cloudera Manager 5 and CDH 5 installation


We use Cloudera Manager 5.1.x and CDH 5.1.x to deploy our own Hadoop cluster.


1. Requirements: Preparing the OS

https://www.cloudera.com/content/support/en/downloads/cloudera_manager/cm-5-1-1.html

1.1 Check the OS Requirement 


We use CentOS 6.4 (64-bit) for our cluster.

1.2 Supported JDK Versions

Cloudera Manager supports Oracle JDK 7u45 and Oracle JDK 6u31, and can install them for you. Additionally, Cloudera Manager supports JDK 1.7u25.

1.3 Supported Databases 

PostgreSQL is used in this case.


2. OS Configuration

Make sure all the PCs in the cluster have the same root password.

2.1 Turn off the firewall

1) Temporary

ON :  service iptables start

OFF: service iptables stop

2) Permanent (persists across reboots)

ON :  chkconfig iptables on

OFF: chkconfig iptables off

3) Check the firewall status

service iptables status

2.2 Configure the proxy

Open the file with “vim /etc/yum.conf” and add the following options:

proxy=http://server:port/

timeout=55555

2.3 Turn off SELinux

Modify the file /etc/selinux/config:

SELINUX=disabled

or, to switch SELinux to permissive mode until the next reboot, use the terminal command:

# sudo setenforce 0
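
The config file edit above can also be scripted. The following is a minimal sketch, assuming GNU sed as shipped with CentOS; the helper name disable_selinux_in is made up for this example, and it takes the file path as an argument so that you can try it on a copy first.

```shell
# Sketch: set SELINUX=disabled in an SELinux config file.
# disable_selinux_in is a hypothetical helper name, not a system command.
disable_selinux_in() {
    config_file="$1"
    # Rewrite the SELINUX= line only (SELINUXTYPE= is left alone) and
    # keep the previous version of the file as <file>.bak.
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Example (as root):
#   disable_selinux_in /etc/selinux/config
```

The change in the config file only takes effect after a reboot, which is why the setenforce command is still useful in the meantime.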

2.4 Install PostgreSQL

In a terminal:

1) Download and install PostgreSQL

# yum install postgresql

# yum install postgresql-server

2) Initialize the database and start the service

# service postgresql initdb

# service postgresql start

3) Start the service automatically at boot

# chkconfig postgresql on

4) Configure the database

Modify the file /var/lib/pgsql/data/postgresql.conf (uncomment these lines by deleting the leading “#”):

listen_addresses = '*'    # what IP address(es) to listen on;
                          # comma-separated list of addresses;
                          # defaults to 'localhost', '*' = all

port = 5432

5) Restart PostgreSQL

# service postgresql restart
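
As an alternative to editing postgresql.conf by hand in step 4, the two settings can be changed with sed. This is only a sketch, assuming GNU sed and the stock CentOS 6 file layout; enable_pg_remote_listen is a hypothetical helper name.

```shell
# Sketch: uncomment and set listen_addresses and port in postgresql.conf.
# enable_pg_remote_listen is a hypothetical helper name.
enable_pg_remote_listen() {
    conf_file="$1"
    # Each expression matches the setting with or without a leading '#',
    # then replaces the whole line. A backup is kept as <file>.bak.
    sed -i.bak \
        -e "s/^#\{0,1\}[[:space:]]*listen_addresses[[:space:]]*=.*/listen_addresses = '*'/" \
        -e "s/^#\{0,1\}[[:space:]]*port[[:space:]]*=.*/port = 5432/" \
        "$conf_file"
}

# Example (as root), followed by the restart from step 5:
#   enable_pg_remote_listen /var/lib/pgsql/data/postgresql.conf
#   service postgresql restart
```

Note that the replacement drops any trailing comment on the two lines it rewrites; all other lines are left unchanged.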


Now reboot the system so that all the configuration changes take effect.


3. CDH cluster configuration

3.1 The environment of the cluster

We use 5 PCs to deploy the Hadoop cluster. (This setup has 1 NameNode and 4 DataNodes.)

All PCs run CentOS 6.4 64-bit. In this case, the cluster has 1 master and 4 slaves. Each node uses a static IP address and can connect to every other node.

PC              Host name
10.211.55.6     Master
10.211.55.7     Slave01
10.211.55.8     Slave02
10.211.55.9     Slave03
10.211.55.10    Slave04


 

3.2 Configuration

1) Change the host name (on each PC)

# vim /etc/sysconfig/network

HOSTNAME=Master

and so on (Slave01, Slave02, Slave03, Slave04)

Note: the system must be rebooted for the new host name to take effect.

2) Modify the hosts file

# vim /etc/hosts 

add:

10.211.55.6 Master

10.211.55.7 Slave01

10.211.55.8 Slave02

10.211.55.9 Slave03

10.211.55.10 Slave04

Note: every PC needs the same entries; you can use the scp command to copy the hosts file to the other PCs.
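
The hosts file update can be sketched as a small script that adds only the missing entries, so it is safe to run more than once. The function names cluster_hosts and add_cluster_hosts are made up for this example; the addresses are the ones from the table in section 3.1.

```shell
# Node list from the table in section 3.1.
cluster_hosts() {
    cat <<'EOF'
10.211.55.6 Master
10.211.55.7 Slave01
10.211.55.8 Slave02
10.211.55.9 Slave03
10.211.55.10 Slave04
EOF
}

# Append each entry to the given hosts file unless a line ending in
# that host name is already there.
add_cluster_hosts() {
    hosts_file="$1"
    cluster_hosts | while read -r ip name; do
        grep -q "[[:space:]]$name\$" "$hosts_file" || echo "$ip $name" >> "$hosts_file"
    done
}

# Example (as root, on Master):
#   add_cluster_hosts /etc/hosts
#   scp /etc/hosts root@10.211.55.7:/etc/hosts    # repeat for each slave
```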

3) Configure password-less SSH login

Let Master log in to the Slaves without a password:

Generate a key pair on Master:

# ssh-keygen -t rsa -P ''      (the keys are saved under /root/.ssh)

Add id_rsa.pub to authorized keys:

# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Restrict the permissions of authorized_keys:

# chmod 600 ~/.ssh/authorized_keys

Modify the SSH server config file:

# vim /etc/ssh/sshd_config

RSAAuthentication yes

PubkeyAuthentication yes

AuthorizedKeysFile .ssh/authorized_keys

(delete the leading '#' on each of these lines)

Restart SSH service:

# service sshd restart

Check that it works:

# ssh Master

Copy the public key to all the Slaves. Using Slave01 as an example:

# scp ~/.ssh/id_rsa.pub root@10.211.55.7:~/

Log in to Slave01:

# ssh 10.211.55.7

Enter the root password, then create the .ssh folder in root's home directory (skip this step if it already exists):

# mkdir ~/.ssh

Set the permissions of the “.ssh” folder:

# chmod 700 ~/.ssh

Add id_rsa.pub to authorized keys:

# cat ~/id_rsa.pub >> ~/.ssh/authorized_keys

Restrict the permissions of authorized_keys:

# chmod 600 ~/.ssh/authorized_keys

Modify the SSH server config file on Slave01:

# vim /etc/ssh/sshd_config

RSAAuthentication yes

PubkeyAuthentication yes

AuthorizedKeysFile .ssh/authorized_keys

(delete the leading '#' on each of these lines)

Restart SSH service:

# service sshd restart

Delete the id_rsa.pub that was copied from Master:

# rm ~/id_rsa.pub

To check that it works, go back to Master first:

# ssh 10.211.55.6 (a password is still required, because so far only Master-to-Slave01 login is configured)

On Master, log in to Slave01:

# ssh Slave01 (if it works, no password is requested)

Note: repeat the same steps for the other slaves.
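
Repeating the copy-and-append steps for four slaves is tedious. A shorter route is the ssh-copy-id tool, which is not what the steps above use: it appends the key to authorized_keys on the remote host for you. The sketch below assumes ssh-copy-id is installed on Master and that the host names from /etc/hosts resolve; SLAVES, DRY_RUN and push_key_to_slaves are names chosen for this example. By default it only prints the commands it would run.

```shell
# Host names from section 3.1 (assumed to resolve via /etc/hosts).
SLAVES="Slave01 Slave02 Slave03 Slave04"

# With DRY_RUN=1 (the default) print the ssh-copy-id command for each
# slave; with DRY_RUN=0 actually run it (root's password is asked once
# per host).
push_key_to_slaves() {
    for host in $SLAVES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@$host"
        else
            ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$host"
        fi
    done
}

# Example (on Master, after ssh-keygen):
#   push_key_to_slaves             # preview the commands
#   DRY_RUN=0 push_key_to_slaves   # copy the key to every slave
```

The sshd_config changes described above still have to be made on each slave if public-key authentication is not already enabled there.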

Let the Slaves log in to Master without a password (you can log in to each slave from Master to do the configuration):

Using Slave01 as an example, generate a key pair on Slave01:

# ssh-keygen -t rsa -P ''

Add id_rsa.pub to authorized keys:

# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Restart SSH service:

# service sshd restart

Check that it works:

# ssh Slave01

Copy the public key to Master:

# scp ~/.ssh/id_rsa.pub root@10.211.55.6:~/

Log in to Master:

# ssh 10.211.55.6 (a password is still required, because the configuration is not finished yet)

Set the permissions of the “.ssh” folder:

# chmod 700 ~/.ssh

Add id_rsa.pub to authorized keys:

# cat ~/id_rsa.pub >> ~/.ssh/authorized_keys

Delete the id_rsa.pub that was copied from Slave01:

# rm ~/id_rsa.pub

Go back from Master to Slave01:

# ssh Slave01 (no password is needed; this direction is already configured)

To check that it works, log in to Master from Slave01:

# ssh Master


4. Install Cloudera Manager and CDH

4.1 Install Cloudera Manager packages from a local repository

1) Install a Web Server

Check the httpd service:

# service httpd status

If it is not installed, install httpd:

# yum install httpd

Start the httpd service automatically at boot (if it is not running yet, also start it once with “service httpd start”):

# chkconfig httpd on

2) Download Tarball and Publish Repository Files

Go to https://archive-primary.cloudera.com/cm5/repo-as-tarball/ to download repo resources.

Unpack the tarball, move the files to the web server directory, and modify file permissions.

# gunzip cm5.0.0-centos6.tar.gz 

# tar xvf cm5.0.0-centos6.tar 

# mv cm /var/www/html

# chmod -R ugo+rX /var/www/html/cm

Visit http://<hostname>:80/cm to verify that you see an index of files.

3) Modify Clients to Find Repository

Create a file named “myrepo.repo” with the following content:


[myrepo]

name=myrepo

baseurl=http://hostname/cm/5

enabled=1

gpgcheck=0


Save this file as /etc/yum.repos.d/myrepo.repo. (Sometimes the other .repo files in that folder need to be removed.)
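
Because every host that installs from the local repository needs this file, it can help to generate it from a script. A minimal sketch follows; write_cm_repo is a hypothetical helper name, and using Master as the web server host is an assumption of the example, not something the steps above prescribe.

```shell
# Sketch: write the myrepo.repo file for a given repository host.
# write_cm_repo is a hypothetical helper name.
write_cm_repo() {
    repo_host="$1"   # host running the web server from step 1
    repo_file="$2"   # where to write the .repo file
    cat > "$repo_file" <<EOF
[myrepo]
name=myrepo
baseurl=http://$repo_host/cm/5
enabled=1
gpgcheck=0
EOF
}

# Example (as root), assuming the repository is served from Master:
#   write_cm_repo Master /etc/yum.repos.d/myrepo.repo
#   yum clean all
```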

4) Download the Cloudera Manager installer binary from https://www.cloudera.com/content/support/en/downloads.html. Download it to the cluster host where you want to install the Cloudera Manager Server.

5) Make cloudera-manager-installer.bin executable.

# chmod u+x cloudera-manager-installer.bin

6) Run the Cloudera Manager Server installer:

# ./cloudera-manager-installer.bin --skip_repo_package=1

7) When the installation finishes, press Return or Enter to choose OK and exit the installer, as shown in figure 4.1.1.


8) Visit http://localhost:7180 to continue with the CDH installation.

The username and password are both “admin”.


 


4.2 Install CDH

1) Choose which edition to install:

In this case, choose Cloudera Express, which does not require a license but provides a somewhat limited set of features.

2) Information is displayed indicating what the CDH installation includes. At this point, you can access online Help or the Support Portal if you wish. Click Continue to proceed with the installation.

3) To enable Cloudera Manager to automatically discover hosts on which to install CDH and managed services, enter the cluster hostnames or IP addresses.  


4) Click Search. Cloudera Manager identifies the hosts on your cluster to allow you to configure them for services.


5) Choose Software Installation Method and Install Software. Select the repository type to use for the installation: parcels or packages. In this case, use parcels.

6) If your local laws permit you to deploy unlimited strength encryption and you are running a secure cluster, check the Install Java Unlimited Strength Encryption Policy Files checkbox; otherwise, leave it unchecked. Click Continue.


7) Specify SSH login properties: Select root or enter the user name for an account that has password-less sudo permission. Select an authentication method: If you choose to use password authentication, enter and confirm the password. If you choose to use public-key authentication provide a passphrase and path to the required key files. You can choose to specify an alternate SSH port. The default value is 22. You can specify the maximum number of host installations to run at once. The default value is 10.

In this case, just select root and enter the root password. (All the PCs have the same root password.)

8) Click Continue. Cloudera Manager performs the following: installs the Oracle JDK and the Cloudera Manager Agent packages and starts the Agent. Click Continue. During the parcel installation, progress is indicated for the two phases of the parcel installation process (Download and Distribution) in separate progress bars. If you are installing multiple parcels you will see progress bars for each parcel. When the Continue button appears at the bottom of the screen, the installation process is completed.


9) Click Continue. The Host Inspector runs to validate the installation, and provides a summary of what it finds, including all the versions of the installed components. If the validation is successful, click Finish. The Cluster Setup page displays.

There may be some warnings; if so, change the configuration as the warnings suggest.

10) Add Services:

Choose All Services - HDFS, YARN (includes MapReduce 2), ZooKeeper, Oozie, Hive, Hue, Sqoop, HBase, Impala, Solr, Spark, and Key-Value Store Indexer.

11) Customize the assignment of role instances to hosts. 

The wizard evaluates the hardware configurations of the hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to which the HDFS DataNode role is assigned. These assignments are typically acceptable, but you can reassign role instances to hosts of your choosing, if desired. Click a field below a role to display a dialog containing a pageable list of hosts. If you click a field containing multiple hosts, you can also select All Hosts to assign the role to all hosts or Custom to display the pageable hosts dialog. 


12) When you are satisfied with the assignments, click Continue. The Database Setup page displays. Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure required databases. Make a note of the auto-generated passwords.

Alternatively, select Use Custom Databases to specify external databases. Enter the database host, database type, database name, username, and password for the database that you created when you set up the database.

13) Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file paths required vary based on the services to be installed.

14) Click Finish to proceed to the Home Page. 


From this point on, all the settings of the cluster should be changed through Cloudera Manager.












