Installing Hadoop and Hive on CentOS 7

Contents
  1. Preface
  2. Installing the JDK
  3. Installing Hadoop
  4. Configuring MySQL
  5. Installing Hive
  6. Bookmarks

Preface

In 《在Ubuntu16.04上安装Hadoop》 (Installing Hadoop on Ubuntu 16.04) we already covered installing Hadoop on Ubuntu. For a course assignment, I now need to install Hadoop on CentOS as well, along with Hive.

Installing the JDK

The JDK installation follows the same steps as in 《使用lanproxy进行内网穿透》 (Using lanproxy for NAT traversal).
1. Remove the JDK bundled with the system:

rpm -e --nodeps `rpm -qa | grep java`

2. List the JDK versions available in the yum repositories:
yum search java | grep jdk

3. Install the java-1.8.0-openjdk-devel.x86_64 (OpenJDK Development Environment) package:
yum install java-1.8.0-openjdk-devel.x86_64

The default installation directory is /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64, also reachable through the /usr/lib/jvm/java-1.8.0-openjdk symlink.
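The exact version suffix changes with each package release, so instead of hard-coding the path you can resolve the real JDK home behind the javac symlink; a minimal sketch:

# resolve the actual JDK home behind the javac alternatives symlink
readlink -f $(which javac) | sed 's|/bin/javac||'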

4. Configure the environment variables:
vim /etc/profile

Append the following at the end:

#set java environment
JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64
JRE_HOME=$JAVA_HOME/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASSPATH PATH

5. Apply the changes immediately:
source /etc/profile

6. Verify the installation:

java
javac
java -version

Installing Hadoop

1. Download and extract Hadoop:

wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
mv hadoop-1.2.1.tar.gz /opt/
cd /opt/
tar -zxvf hadoop-1.2.1.tar.gz

2. Edit hadoop-env.sh:

cd hadoop-1.2.1/conf/
vim hadoop-env.sh

The main change is setting JAVA_HOME:

# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64

3. Edit core-site.xml so it reads:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop</value>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>/hadoop/name</value>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
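Note that fs.default.name points at the hostname master, which must resolve on this machine. The original setup does not show its hosts file, but for a single-node install where this host plays the master role, a minimal mapping would be:

# assumption: single-node setup; map "master" to this machine so hdfs://master:9000 resolves
echo "127.0.0.1 master" >> /etc/hosts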

4. Edit hdfs-site.xml so it reads:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/hadoop/data</value>
  </property>
</configuration>

5. Edit mapred-site.xml so it reads:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>

6. Edit /etc/profile again and extend PATH:

export HADOOP_HOME=/opt/hadoop-1.2.1
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$PATH

Apply the changes with source /etc/profile.

7. Test by running hadoop. If the usage message listing the available COMMANDs appears, the installation and configuration succeeded.

8. The test in step 7 prints "Warning: $HADOOP_HOME is deprecated." because newer Hadoop releases deprecated the HADOOP_HOME variable. To get rid of the warning, either switch to HADOOP_PREFIX or add the following line to hadoop-env.sh:

export HADOOP_HOME_WARN_SUPPRESS=1

9. Format the NameNode: hadoop namenode -format

10. Start Hadoop: cd /opt/hadoop-1.2.1/bin, then run start-all.sh.
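To confirm the daemons actually came up, you can check with jps, which ships with the JDK; on a single-node Hadoop 1.2.1 install you would expect output along these lines (PIDs will differ):

jps
# expected: NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker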

11. List the files in HDFS: hadoop fs -ls /
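For an end-to-end read/write check, you can push a local file into HDFS and read it back; the path /test.txt is just an illustrative choice:

echo "hello hadoop" > /tmp/test.txt
hadoop fs -put /tmp/test.txt /test.txt
hadoop fs -cat /test.txt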

12. Stop Hadoop: stop-all.sh

Configuring MySQL

1. This assumes MySQL is already installed; the version used here is 5.6.29.

2. Create a hive database to store Hive's metadata (switching it to latin1 avoids index-length problems the metastore schema can hit with multi-byte character sets):

create database hive;
alter database hive character set latin1;

3. Create a hadoop user whose password, mysql, will later be configured as the connection password in hive-site.xml; grant it privileges (the statement below grants on all databases, *.*; restricting it to hive.* would also suffice); then flush the privilege tables:

create user 'hadoop'@'%' identified by 'mysql';
grant all privileges on *.* to 'hadoop'@'%' with grant option;
flush privileges;
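Before moving on, it is worth verifying that the new account can log in; -pmysql passes the password set above inline:

mysql -u hadoop -pmysql -e "show databases;"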

Installing Hive

1. Download and extract Hive:

cd /opt
wget http://mirror.bit.edu.cn/apache/hive/hive-1.2.2/apache-hive-1.2.2-bin.tar.gz
tar -zxvf apache-hive-1.2.2-bin.tar.gz

2. Configure HIVE_HOME: open /etc/profile with vim and append:

export HIVE_HOME=/opt/apache-hive-1.2.2-bin
export PATH=$HIVE_HOME/bin:$PATH

Apply the changes with source /etc/profile.

3. Create hive-site.xml from the template and edit it:

cd /opt/apache-hive-1.2.2-bin/conf/
cp hive-default.xml.template hive-site.xml
vim hive-site.xml

In vim, search with /ConnectionURL to jump to the ConnectionURL property, then change the relevant properties to:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hadoop</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mysql</value>
</property>

<property>
  <name>hive.support.sql11.reserved.keywords</name>
  <value>false</value>
  <description>
    This flag should be set to true to enable support for SQL2011 reserved keywords.
    The default value is true.
  </description>
</property>

4. Download the MySQL JDBC driver (mysql-connector-java) and copy it into Hive's lib directory:

wget http://www.java2s.com/Code/JarDownload/mysql/mysql-connector-java-5.1.22-bin.jar.zip
unzip mysql-connector-java-5.1.22-bin.jar.zip
mv mysql-connector-java-5.1.22-bin.jar /opt/apache-hive-1.2.2-bin/lib

5. Edit hive-env.sh:

cp hive-env.sh.template hive-env.sh
vi hive-env.sh

Set:

# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/hadoop-1.2.1

6. Start Hadoop: cd /opt/hadoop-1.2.1/bin, then run start-all.sh.

7. Start the metastore: nohup hive --service metastore > metastore.log 2>&1 &
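To check that the metastore really started, look at the log it writes and at its default Thrift port, 9083 (netstat here assumes the net-tools package is installed):

tail -n 20 metastore.log
netstat -nltp | grep 9083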

8. Start Hive by running hive.
This fails with the error: The root scratch dir: /tmp/hive on HDFS should be writable. Fix the permissions, both locally and on HDFS:

mkdir /tmp/hive
chmod -R 777 /tmp/hive
hadoop fs -chmod -R 777 /tmp/hive

Starting Hive again fails with another error: Relative path in absolute URI: ${system:java.io.tmpdir}/${system:user.name}
Create a tmpdir directory, both locally and on HDFS:

mkdir /tmp/tmpdir
chmod -R 777 /tmp/tmpdir
hadoop fs -mkdir /tmp/tmpdir
hadoop fs -chmod -R 777 /tmp/tmpdir

Then, in hive-site.xml, find every occurrence of ${system:java.io.tmpdir}/${system:user.name} (and any leftover ${system:java.io.tmpdir}) and replace it with /tmp/tmpdir.
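Instead of hand-editing, a sed one-liner can do the replacement; this is a sketch that handles the combined form and keeps a backup via -i.bak:

sed -i.bak 's|\${system:java.io.tmpdir}/\${system:user.name}|/tmp/tmpdir|g' /opt/apache-hive-1.2.2-bin/conf/hive-site.xml
# check for any occurrences the pattern above missed
grep -n 'system:java.io.tmpdir' /opt/apache-hive-1.2.2-bin/conf/hive-site.xml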

Start Hive again; this time it starts successfully.
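As a quick smoke test, you can run a statement or two through the CLI; the table name demo_t below is just an illustrative placeholder:

hive -e "create table demo_t (id int, name string); show tables;"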

Bookmarks

hive2.1.1安装部署 (Installing and deploying Hive 2.1.1)

Hive安装与配置 (Installing and configuring Hive)

Hive集成mysql数据库 (Integrating Hive with MySQL)

Hive安装配置指北(含Hive Metastore详解) (A guide to installing and configuring Hive, with a detailed look at the Hive Metastore)