Installing and Using HUE

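1. Introduction

HUE is an open-source UI system for Apache Hadoop. It was originally developed by Cloudera and later contributed to the open-source community, and it is built on the Python web framework Django. With Hue you can operate a Hadoop cluster from a browser, for example to put and get files or to run MapReduce jobs.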

2. Installation

2.1 Install the third-party packages Hue depends on

# Install the XML package
$>sudo yum install -y libxml2-devel.x86_64

# Install the other packages
$>sudo yum install -y libxslt-devel.x86_64 python-devel openldap-devel asciidoc cyrus-sasl-gssapi

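3. Configure Hue

Hue can connect to Hadoop, that is, access files in HDFS, in two ways.

WebHDFS

Provides high-speed data transfer; the client can communicate with the DataNodes directly.

HttpFS

A proxy service that makes integration easier for systems outside the cluster. Note: in HA mode, only this option can be used.

3.1 Configure the Hue proxy user in Hadoop

[/soft/hadoop/etc/hadoop/core-site.xml]

Note: Hadoop proxy users are configured with properties of the form hadoop.proxyuser.${superuser}.hosts and hadoop.proxyuser.${superuser}.groups. Here the superuser is centos.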

<property>
    <name>hadoop.proxyuser.centos.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.centos.groups</name>
    <value>*</value>
</property>
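
The proxy-user entries follow one naming pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of Hadoop or Hue: the function name proxyuser_properties and its prefix parameter are hypothetical, and it only renders the property text shown above for a given superuser.

```python
from xml.sax.saxutils import escape


def proxyuser_properties(superuser, hosts="*", groups="*", prefix="hadoop"):
    """Render the two proxy-user <property> elements for one superuser."""
    entries = [
        (f"{prefix}.proxyuser.{superuser}.hosts", hosts),
        (f"{prefix}.proxyuser.{superuser}.groups", groups),
    ]
    blocks = []
    for name, value in entries:
        # escape() protects against characters such as & or < in the values
        blocks.append(
            "<property>\n"
            f"    <name>{escape(name)}</name>\n"
            f"    <value>{escape(value)}</value>\n"
            "</property>"
        )
    return "\n".join(blocks)
```

Calling proxyuser_properties("centos") yields the two core-site.xml entries above; passing prefix="httpfs" yields the matching entries for httpfs-site.xml.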

[/soft/hadoop/etc/hadoop/hdfs-site.xml]

<property>
 <name>dfs.webhdfs.enabled</name>
 <value>true</value>
</property>

[/soft/hadoop/etc/hadoop/httpfs-site.xml]

<property>
    <name>httpfs.proxyuser.centos.hosts</name>
    <value>*</value>
</property>
<property>
    <name>httpfs.proxyuser.centos.groups</name>
    <value>*</value>
</property>

Distribute the configuration files to the other nodes

$>cd /soft/hadoop/etc/hadoop
$>xsync.sh core-site.xml
$>xsync.sh hdfs-site.xml
$>xsync.sh httpfs-site.xml

3.2 Restart the HDFS and YARN processes

$>stop-dfs.sh
$>stop-yarn.sh

$>start-dfs.sh
$>start-yarn.sh

3.3 Start the HttpFS process

3.3.1 Start the process

$>/soft/hadoop/sbin/httpfs.sh start

3.3.2 Check port 14000

$>netstat -anop | grep 14000
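
The same check can be made from another machine, where netstat on the HttpFS host is not available. This is a small sketch using only the Python standard library; the helper name port_is_open is hypothetical, and it only tests that something accepts TCP connections on the port, not that it is actually HttpFS.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False
```

For this walkthrough the call would be port_is_open("s101", 14000). The same helper also works for the other ports used later, such as 8888, 10000, 9090, and 8998.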


3.4 Configure the Hue file

This setup runs the Hadoop NameNode in HA mode, so HDFS files can only be accessed through HttpFS. Note that webhdfs_url points to port 14000, as shown below.

[/home/centos/hue-3.12.0/desktop/conf/hue.ini]

...
    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://mycluster:8020

      # NameNode logical name.
      logical_name=mycluster

      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      webhdfs_url=http://s101:14000/webhdfs/v1

      # Change this if your HDFS cluster is Kerberos-secured
      ## security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

      # Directory of the Hadoop configuration
      hadoop_conf_dir=/soft/hadoop/etc/hadoop
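
To see what Hue does with webhdfs_url, it helps to build one of the REST URLs by hand. The sketch below assumes the standard WebHDFS query parameters op and user.name; the function name webhdfs_url and the example path /user/centos are illustrative only and do not come from Hue.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base_url, path, op, user):
    """Build a WebHDFS/HttpFS REST URL for an absolute HDFS path."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /user/centos")
    # base_url is the webhdfs_url value from hue.ini
    query = urlencode({"op": op, "user.name": user})
    return f"{base_url.rstrip('/')}{quote(path)}?{query}"
```

For example, webhdfs_url("http://s101:14000/webhdfs/v1", "/user/centos", "LISTSTATUS", "centos") produces http://s101:14000/webhdfs/v1/user/centos?op=LISTSTATUS&user.name=centos. Requesting that URL with curl or a browser should return a JSON directory listing if HttpFS and the proxy-user settings are working.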

3.5 Configure MySQL as the Hue database

...
    [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, sqlite3 or oracle.
    #
    # Note that for sqlite3, 'name', below is a path to the filename. For other backends, it is the database name
    # Note for Oracle, options={"threaded":true} must be set in order to avoid crashes.
    # Note for Oracle, you can use the Oracle Service Name by setting "host=" and "port=" and then "name=<host>:<port>/<service_name>".
    # Note for MariaDB use the 'mysql' engine.
    engine=mysql
    host=192.168.231.1
    port=3306
    user=root
    password=root
    # Execute this script to produce the database password. This will be used when 'password' is not set.
    ## password_script=/path/script
    name=hue
    ## options={}
    # Database schema, to be used only when public schema is revoked in postgres
    ## schema=

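4. Initialize the MySQL database and create the tables

4.1 Create the hue database

hue.ini specifies hue as the database name, so the hue database must be created first.

mysql>create database hue ;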

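4.2 Initialize the tables

This step creates the tables and inserts some initial data. The tables are initialized with Hue's syncdb command, which prompts for a username and password during creation:

# Synchronize the database
$>~/hue-3.12.0/build/env/bin/hue syncdb
# Import data, mainly the tables needed by oozie, pig, and desktop
$>~/hue-3.12.0/build/env/bin/hue migrate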

4.3 Check that the tables exist in MySQL

Check that the required tables were created in MySQL:

mysql>show tables ;

5. Start the Hue process

$>~/hue-3.12.0/build/env/bin/supervisor


6. Check the web UI

http://s101:8888/

Open the login page and sign in with the account created earlier.

7. Access HDFS

Click the HDFS link in the upper-right corner to open the HDFS view.

8. Configure the ResourceManager

8.1 Edit the hue.ini configuration file

  [[yarn_clusters]]
    ...
    # [[[ha]]]
      # Resource Manager logical name (required for HA)
      logical_name=cluster1

      # Un-comment to enable
      ## submit_to=True

      # URL of the ResourceManager API
      resourcemanager_api_url=http://s101:8088
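
The job information Hue displays can also be read directly from the ResourceManager REST API at resourcemanager_api_url. The sketch below is an illustration under that assumption: the helper names are hypothetical, and it only builds the cluster/apps URL and extracts the id, name, and state fields from a response body.

```python
import json
from urllib.parse import urlencode


def cluster_apps_url(resourcemanager_api_url, state=None):
    """Build the ResourceManager REST URL that lists applications."""
    url = f"{resourcemanager_api_url.rstrip('/')}/ws/v1/cluster/apps"
    if state:
        # Optional filter, e.g. RUNNING or FINISHED
        url += "?" + urlencode({"states": state})
    return url


def summarize_apps(response_text):
    """Extract (id, name, state) tuples from a cluster/apps JSON response."""
    payload = json.loads(response_text)
    # "apps" is null when no application matches the query
    apps = (payload.get("apps") or {}).get("app", [])
    return [(app["id"], app["name"], app["state"]) for app in apps]
```

With the value configured above, cluster_apps_url("http://s101:8088", "RUNNING") gives http://s101:8088/ws/v1/cluster/apps?states=RUNNING, and the JSON returned by that URL can be passed to summarize_apps.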

8.2 View job execution status

9. Configure Hive

9.1 Edit the hue.ini file

[beeswax]
  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=s101

  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/soft/hive/conf

9.2 Install the dependency packages

Without the following packages, Hue fails with SASL-related errors reporting that HiveServer2 is not running.

$>sudo yum install -y cyrus-sasl-plain  cyrus-sasl-devel  cyrus-sasl-gssapi

9.3 Start the HiveServer2 server

$>/soft/hive/bin/hiveserver2

9.4 Check the web UI

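10. Configure HBase

10.1 Edit the hue.ini configuration file

The HBase setting takes the address of the Thrift server, not the master, and the value must be wrapped in parentheses. The Thrift server has to be started separately.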

[hbase]
  # Comma-separated list of HBase Thrift servers for clusters in the format of '(name|host:port)'.
  # Use full hostname with security.
  # If using Kerberos we assume GSSAPI SASL, not PLAIN.
  hbase_clusters=(s101:9090)

  # HBase configuration directory, where hbase-site.xml is located.
  hbase_conf_dir=/soft/hbase/conf
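
The hbase_clusters value has a small syntax of its own, which is easy to get wrong. The sketch below parses the '(name|host:port)' format described in the hue.ini comment; it is an illustration, not Hue's own parser. Accepting the shorter (host:port) form used in this walkthrough, with the host doubling as the name, is an assumption of this sketch, and the cluster name Cluster in the usage note is made up.

```python
import re

# One entry: '(name|host:port)', with the 'name|' part treated as optional
_ENTRY = re.compile(
    r"\((?:(?P<name>[^()|]+)\|)?(?P<host>[^:()|]+):(?P<port>\d+)\)"
)


def parse_hbase_clusters(value):
    """Split a comma-separated hbase_clusters value into dict entries."""
    clusters = []
    for entry in value.split(","):
        match = _ENTRY.fullmatch(entry.strip())
        if match is None:
            raise ValueError(f"malformed cluster entry: {entry!r}")
        host = match["host"]
        clusters.append(
            {"name": match["name"] or host, "host": host, "port": int(match["port"])}
        )
    return clusters
```

For example, parse_hbase_clusters("(Cluster|s101:9090)") returns one entry with name Cluster, host s101, and port 9090, while a value without parentheses raises ValueError.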

10.2 Start the Thrift server

Note: the Thrift server is started under the name thrift. Remember: some documents say thrift2, but here it is thrift.

$>hbase-daemon.sh start thrift

10.3 Check port 9090

10.4 View HBase in Hue

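11. Configure Spark

11.1 Introduction

Hue integrates with Spark through Livy server, which plays a role similar to HiveServer2. It exposes a RESTful service that accepts HTTP requests from clients and forwards them to the Spark cluster. Livy server is not part of the Spark distribution and must be downloaded separately.

Note: Scala or Python programs are written in Hue through the notebook. For the notebook to work, the Hadoop HttpFS process must be running. Do not forget this!

Download a reasonably recent version; otherwise some classes cannot be found. The download address is: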

http://mirrors.tuna.tsinghua.edu.cn/apache/incubator/livy/0.5.0-incubating/livy-0.5.0-incubating-bin.zip

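11.2 Extract the archive

The file name below is for livy-server-0.2.0; substitute the name of the archive you actually downloaded.

$>unzip livy-server-0.2.0.zip -d /soft/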

11.3 Start the Livy server

$>/soft/livy-server-0.2.0/bin/livy-server

11.4 Configure Hue

Launching jobs in local or yarn mode is recommended; here we set it to spark://s101:7077.

[spark]
  # Host address of the Livy Server.
  livy_server_host=s101

  # Port of the Livy Server.
  livy_server_port=8998

  # Configure Livy to start in local 'process' mode, or 'yarn' workers.
  livy_server_session_kind=spark://s101:7077
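
Behind the notebook, Hue talks to Livy over HTTP, so the Livy server can be tested independently of Hue. The following sketch builds, but does not send, the request that creates an interactive session through Livy's POST /sessions endpoint. The function name livy_session_request is hypothetical, and the default kind "spark" (a Scala session) is an assumption chosen to match section 11.5.

```python
import json
from urllib.request import Request


def livy_session_request(host, port, kind="spark"):
    """Build (but do not send) the POST request that creates a Livy session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return Request(
        url=f"http://{host}:{port}/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the values from hue.ini the call is livy_session_request("s101", 8998). Passing the result to urllib.request.urlopen sends it; a running Livy server answers with a JSON description of the new session.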

11.5 Write a Scala program in the notebook

Original article: https://www.cnblogs.com/liuys635/p/11287670.html
