This post describes the process I followed to install Hortonworks HDP 3.1.0 on a cluster of three VMware virtual hosts. The process follows three major steps: 1) set up the cluster environment; 2) set up a local repository for both Ambari and the HDP stack; 3) install the HDP stack through the Ambari server. You can follow this process to install other versions; check product versions through the Hortonworks support matrix: https://supportmatrix.hortonworks.com/
Table of Contents
- Table of Contents
- Virtual Nodes Info
- Prepare the Cluster Environment
- Set up a Local Repository for Ambari and HDP Stack
- Installing Ambari Server and Agent
- Install, Configure and Deploy the Cluster
Virtual Nodes Info
The first thing I did was set up three virtual hosts in VMware with the following info:
|Host Name|Host IP|Configuration|
|---|---|---|
|hadoop-master|146.xxx.xxx.75|4 x CPU, 16 GB RAM, 100 GB disk|
|hadoop-node-1|146.xxx.xxx.76|4 x CPU, 16 GB RAM, 100 GB disk|
|hadoop-node-2|146.xxx.xxx.77|4 x CPU, 16 GB RAM, 100 GB disk|
On each host, yum has already been set up.
Prepare the Cluster Environment
0. Setting Proxy
If you're behind a proxy, many repositories, including the yum repository, are accessed through proxy servers. On each host, you can set the proxy server info by adding the following to `/root/.bashrc`:

```
export https_proxy=https://<your.https-proxy.address>:<port#>
export http_proxy=http://<your.http-proxy.address>:<port#>
```

Then run `source /root/.bashrc` to apply the change.
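Yum can also be pointed at the proxy explicitly, independent of the shell environment variables. A minimal sketch, reusing the placeholder proxy address from above, adds a `proxy` line to `/etc/yum.conf`:

```
# /etc/yum.conf (fragment) — route yum traffic through the proxy.
# The address is the same placeholder used above, not a real value.
proxy=http://<your.http-proxy.address>:<port#>
```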
1. Make sure the run level is multi-user text mode.
Run the following command to check the run level:

```
systemctl get-default
```

It is expected to return `multi-user.target`. If not, run the following command to change it:

```
systemctl set-default multi-user.target
```
2. Check and set hostnames.
- For each host in the cluster, confirm that the hostname is set to an FQDN by running `hostname -f`. This should return a fully qualified domain name (FQDN), in a format like `hadoop-master.qualified.domain.name`. You can use the `hostname` command to set the hostname on each host in the cluster, for example:

```
hostname hadoop-master.qualified.domain.name
```
- Edit the network configuration file. For each host in the cluster, open the network configuration file through `vi /etc/sysconfig/network` and modify the HOSTNAME property to the FQDN, e.g. `HOSTNAME=hadoop-master.qualified.domain.name`.
- Edit the hosts file. Open the hosts file on each host in the cluster through `vi /etc/hosts` and add a line for each host in the cluster. For example:

```
146.xxx.xxx.75 hadoop-master.qualified.domain.name hadoop-master
146.xxx.xxx.76 hadoop-node-1.qualified.domain.name hadoop-node-1
146.xxx.xxx.77 hadoop-node-2.qualified.domain.name hadoop-node-2
```
After making these changes, reboot each host by running `reboot`.
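The hosts-file edits above can be scripted so all three hosts get identical entries. A minimal sketch, assuming the masked IPs and the placeholder domain from this post:

```shell
# add_cluster_hosts FILE: append the cluster's IP/FQDN/alias mappings
# to the given hosts file (normally /etc/hosts).
# The IPs and domain below are the masked placeholders from this post.
add_cluster_hosts() {
  cat >> "$1" <<'EOF'
146.xxx.xxx.75 hadoop-master.qualified.domain.name hadoop-master
146.xxx.xxx.76 hadoop-node-1.qualified.domain.name hadoop-node-1
146.xxx.xxx.77 hadoop-node-2.qualified.domain.name hadoop-node-2
EOF
}
# Usage (on each host): add_cluster_hosts /etc/hosts
```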
3. Set up password-less SSH:
- Log in as the `root` user and generate SSH keys using `ssh-keygen -t rsa`. Press Enter at all prompts and accept all default values.
- Copy the SSH identification to localhost (e.g. with `ssh-copy-id hadoop-master`) and enter the password when prompted. Then run `ssh hadoop-master` to make sure no password is needed.
For each of the other hosts in the cluster:
- Copy the SSH directory from `hadoop-master` to every other host in the cluster, for example:

```
scp -pr /root/.ssh firstname.lastname@example.org:/root/
```
- Append `root`'s generated public key to the remote host's `.ssh/authorized_keys` file, for example:

```
cat .ssh/id_rsa.pub | ssh email@example.com 'cat >> .ssh/authorized_keys'
```
- Set permissions for the `.ssh` directory and `authorized_keys` file:

```
ssh firstname.lastname@example.org; chmod 700 .ssh; chmod 640 .ssh/authorized_keys
```
- From the `hadoop-master` host, run the following commands in sequence to make sure inter-node SSH connections work without a password:

```
ssh hadoop-node-1
ssh hadoop-node-2
ssh hadoop-master
ssh hadoop-node-2
ssh hadoop-node-1
ssh hadoop-master
```
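The key-copying steps above can be wrapped in a small helper so every node is handled the same way. A sketch, assuming the hostnames from this post; `COPY_CMD` is overridable (e.g. set it to `echo` for a dry run):

```shell
# distribute_key NODE...: copy root's public SSH key to each node so
# later ssh logins need no password.
# COPY_CMD defaults to ssh-copy-id; override with echo to dry-run.
distribute_key() {
  for node in "$@"; do
    ${COPY_CMD:-ssh-copy-id} "root@$node"
  done
}
# Usage: distribute_key hadoop-master hadoop-node-1 hadoop-node-2
```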
4. Enable NTP
Run the following commands on each host to install and enable the NTP service:

```
yum install -y ntp
systemctl enable ntpd
systemctl start ntpd
```

After that, run `timedatectl status` and look for the following lines to verify that NTP is running:

```
NTP enabled: yes
NTP synchronized: yes
```
If NTP is not synchronized with your NTP server:
- Stop the NTP service: `systemctl stop ntpd`
- Add `server your.ntp.server.address` to the servers section of `/etc/ntp.conf`.
- Force time synchronization (e.g. `ntpdate your.ntp.server.address`).
- Restart NTP: `systemctl start ntpd`
- Run `systemctl enable ntpdate` to make sure `ntpdate` runs at boot time.
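The `/etc/ntp.conf` change above amounts to replacing the default `server` entries with your own. A sketch of the relevant fragment, using the same placeholder address:

```
# /etc/ntp.conf (fragment) — point ntpd at your own NTP server
# (placeholder address; comment out the default pool servers).
server your.ntp.server.address iburst
```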
5. Configuring Firewall
Run the following commands to disable the firewall on each host in the cluster:

```
systemctl disable firewalld
service firewalld stop
```

Run `systemctl status firewalld` to confirm the firewall is disabled.
6. Disable SElinux
For each host in the cluster, change the SELINUX value from enforcing to disabled in `/etc/selinux/config`.
7. Install wget
Install `wget` on all nodes:

```
yum install -y wget
```
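The SELinux change can be applied with `sed` instead of manual editing. A minimal sketch; the function takes the config file path as a parameter (normally `/etc/selinux/config`) so it can be tried on a copy first:

```shell
# disable_selinux FILE: flip SELINUX=enforcing to SELINUX=disabled
# in the given SELinux config file. Takes effect after a reboot.
disable_selinux() {
  sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' "$1"
}
# Usage (on each host): disable_selinux /etc/selinux/config
```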
Set up a Local Repository for Ambari and HDP Stack
1. Create and start an HTTP server on the master host:

```
yum install -y httpd
service httpd restart
chkconfig httpd on
```

Make sure the document root `/var/www/html` has been created on the host.
2. Set up the local repository
- Download the tarball files for the Ambari and HDP stacks with the following commands:

```
wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/220.127.116.11/ambari-18.104.22.168-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/22.214.171.124/HDP-126.96.36.199-centos7-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-188.8.131.52/repos/centos7/HDP-UTILS-184.108.40.206-centos7.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-GPL/centos7/3.x/updates/220.127.116.11/HDP-GPL-18.104.22.168-centos7-gpl.tar.gz
```

- Untar and copy the files to `/var/www/html/`, for example `tar zxvf ambari-22.214.171.124-centos7.tar.gz -C /var/www/html/`. Then record the base URLs, which are needed for installing the cluster:

```
Ambari:    http://146.xxx.xxx.75/ambari/centos7/126.96.36.199-139/
HDP:       http://146.xxx.xxx.75/HDP/centos7/188.8.131.52-78/
HDP-GPL:   http://146.xxx.xxx.75/HDP-GPL/centos7/184.108.40.206-78/
HDP-UTILS: http://146.xxx.xxx.75/HDP-UTILS/centos7/220.127.116.11/
```
- Make sure you can browse each base URL in a web browser.
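The browser check can also be done from the command line with `curl`. A sketch; the host IP below is the masked placeholder from this post, so substitute your repository server's real address:

```shell
# check_url URL: print the HTTP status code for URL (000 on failure).
check_url() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}
# A 200 for each recorded base URL means the repository is browsable:
# check_url http://146.xxx.xxx.75/ambari/centos7/
# check_url http://146.xxx.xxx.75/HDP/centos7/
```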
Installing Ambari Server and Agent
1. Download Ambari Repository
- Log in to the master host as `root`.
- Check the repository URL from the Ambari Repository Links.
- Download the Ambari repository file to the directory `/etc/yum.repos.d/` with the following command:

```
wget -nv http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/18.104.22.168/ambari.repo -O /etc/yum.repos.d/ambari.repo
```
- Edit the `ambari.repo` file and change the `baseurl` and `gpgkey` to the local repository obtained above.
- Run `yum repolist` to confirm that the repository has been configured successfully. You should see `ambari-22.214.171.124-xxx` on the list. See Download Ambari Repository for more information.
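Pointing `ambari.repo` at the local repository can be scripted as well. A sketch; `point_repo_local` takes the repo file and the local base/gpgkey URLs as parameters, and the URLs in the usage comment are placeholders following the masked values above:

```shell
# point_repo_local FILE BASEURL GPGKEY: rewrite the baseurl= and
# gpgkey= lines of an ambari.repo-style file to the local repository.
point_repo_local() {
  sed -i \
    -e "s|^baseurl=.*|baseurl=$2|" \
    -e "s|^gpgkey=.*|gpgkey=$3|" \
    "$1"
}
# Usage, with the Ambari base URL recorded earlier (masked placeholder;
# the GPG key path is whatever key file your local repo actually serves):
# point_repo_local /etc/yum.repos.d/ambari.repo \
#   http://146.xxx.xxx.75/ambari/centos7/<version>/ \
#   http://146.xxx.xxx.75/ambari/centos7/<version>/<gpg-key-file>
```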
2. Install Ambari Server
Install the Ambari server on the master node with the command:

```
yum install -y ambari-server
```
See Install Ambari Server for more information.
3. Set up Ambari Server
If the Ambari server host is behind a proxy, add the following JVM options in the Ambari environment file:

```
-Dhttp.proxyHost=<yourProxyHost> -Dhttp.proxyPort=<yourProxyPort> -Dhttps.proxyHost=<yourProxyHost> -Dhttps.proxyPort=<yourProxyPort>
```

Run `ambari-server setup` on the Ambari server host to start the setup process and follow the prompts. See Set Up Ambari Server for more information.
Install, Configure and Deploy the Cluster
1. Start the Ambari Server
Start the Ambari server with `ambari-server start`. After the server starts successfully, you can log in to the web UI with the default username/password `admin`/`admin`.
2. Installing HDP through Installation Wizard
Follow the steps of the Wizard to install HDP:
Step 0 Get Started: give a name to your cluster.
Step 1 Select Version: select Use Local Repository. Delete all other operating systems, leaving redhat7 only. Copy the base URLs recorded earlier into the corresponding fields.
Step 2 Install Options: enter the FQDNs of the target hosts and provide root's SSH private key so the hosts can be registered automatically.
Step 3 Confirm Hosts: it will automatically do the registration with the settings from Step 2.
Step 4 Choose Services: choose basic ones, you can add more later.
Step 5 Assign Masters: keep the defaults.
Step 6 Assign Slaves and Clients: select all for the slave and client components.
Error 1: empty HDP URL: https://community.hortonworks.com/articles/231020/ambari-273-ambari-writes-empty-baseurl-values-writ.html
Error 2: Requires: libtirpc-devel: https://community.hortonworks.com/idea/107386/libtirpc-devel-required.html Run the following commands on all hosts:

```
subscription-manager repos --enable=rhel-7-server-optional-rpms
yum install -y libtirpc-devel
```
Error 3 Hive install failed because of mysql-connector-java.jar due to HTTP Error 404: Not Found: https://community.hortonworks.com/articles/170133/hive-start-failed-because-of-ambari-error-mysql-co.html Run the following commands on the Ambari server:

```
yum install -y mysql-connector-java
ls -al /usr/share/java/mysql-connector-java.jar
cd /var/lib/ambari-server/resources/
ln -s /usr/share/java/mysql-connector-java.jar mysql-connector-java.jar
```
Error 4 Empty Baseurl for Public Repository (No solution, might be proxy issue): https://community.hortonworks.com/questions/45147/ambari-setup-select-stack-404-error.html https://community.hortonworks.com/questions/35820/hdp-stack-repositories-not-found.html
Error 5 Ambari Files View - Service hdfs check failed: Solved: to fix it, try creating a new "File View" instance by clicking the "Create Instance" button on the File View. You can choose the default options to create the view instance (if the cluster is not kerberized). https://community.hortonworks.com/questions/128758/service-hdfs-check-failed-from-ambari.html?page=1&pageSize=10&sort=votes
Official HDP 3.1.0 installation documentation: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/index.html
Apache Ambari Installation Document: https://docs.hortonworks.com/HDPDocuments/Ambari-126.96.36.199/bk_ambari-installation/content/ch_Getting_Ready.html
Check Hortonworks Support Matrix to make sure product versions: https://supportmatrix.hortonworks.com/
Using yum with a Proxy server https://docs.fedoraproject.org/en-US/Fedora_Core/3/html/Software_Management_Guide/sn-yum-proxy-server.html