《OD大数据实战》: Setting Up the Hue Environment


    Official site:

    http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6/

    I. Hue Environment Setup

    1. Download

    http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6.tar.gz
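
    If the machine has direct network access, the tarball can be fetched with wget (a sketch; any download method works):

    wget http://archive.cloudera.com/cdh5/cdh/5/hue-3.7.0-cdh5.3.6.tar.gz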

    2. Extract

    tar -zxvf hue-3.7.0-cdh5.3.6.tar.gz -C /opt/modules/cdh/

    3. Install dependencies

    sudo yum -y install ant asciidoc cyrus-sasl-devel cyrus-sasl-gssapi gcc gcc-c++ krb5-devel libtidy libxml2-devel libxslt-devel openldap-devel python-devel sqlite-devel openssl-devel mysql-devel gmp-devel

    4. Build and install

    cd /opt/modules/cdh/hue-3.7.0-cdh5.3.6/
    
    make apps

    5. Start

    build/env/bin/supervisor

    On first login you are prompted to create a username and password; for convenience, use a user that already has the required HDFS permissions, since this first account becomes the Hue superuser.
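
    Before opening a browser, you can probe the web server from the shell; a minimal check, assuming the http_host/http_port values configured in the [desktop] section below:

    # A 200 response, or a 302 redirect to the login page, means Hue is up
    curl -I http://beifeng-hadoop-02:8888/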

    II. Integration

    1. Configure the [desktop] section in hue.ini

      # Set this to a random string, the longer the better.
      # This is used for secure hashing in the session store.
      secret_key=hue_session_store_secret_key_30_60_character
    
      # Webserver listens on this address and port
      http_host=beifeng-hadoop-02
      http_port=8888
    
      # Time zone name
      time_zone=Asia/Shanghai

    2. Integrate HDFS and YARN

    1) Configure HDFS in hue.ini

    [hadoop]
    
      # Configuration for HDFS NameNode
      # ------------------------------------------------------------------------
      [[hdfs_clusters]]
        # HA support by using HttpFs
    
        [[[default]]]
          # Enter the filesystem uri
          fs_defaultfs=hdfs://beifeng-hadoop-02:9000
    
          # NameNode logical name.
          ## logical_name=
    
          # Use WebHdfs/HttpFs as the communication mechanism.
          # Domain should be the NameNode or HttpFs host.
          # Default port is 14000 for HttpFs.
          webhdfs_url=http://beifeng-hadoop-02:50070/webhdfs/v1
          ## webhdfs_url=http://beifeng-hadoop-02:14000/webhdfs/v1

          # Change this if your HDFS cluster is Kerberos-secured
          ## security_enabled=false

          # Default umask for file and directory creation,
          # specified in an octal value.
          ## umask=022

          # Directory of the Hadoop configuration
          hadoop_conf_dir=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop

    2) Configure hdfs-site.xml

    <configuration>
    
            <!-- Number of replicas; here equal to the total number of DataNodes -->
            <property>
                    <name>dfs.replication</name>
                    <value>1</value>
            </property>
    
            <property>
                    <name>dfs.namenode.secondary.http-address</name>
                    <value>beifeng-hadoop-02:50090</value>
            </property>
    
            <property>
                    <name>dfs.permissions.enabled</name>
                    <value>false</value>
            </property>
    
            <property>
                  <name>dfs.webhdfs.enabled</name>
                  <value>true</value>
            </property>
            
    </configuration>

    3) Configure core-site.xml

       <!-- Hue -->
       <property>
          <name>hadoop.proxyuser.hue.hosts</name>
          <value>*</value>
       </property>
       <property>
          <name>hadoop.proxyuser.hue.groups</name>
          <value>*</value>
       </property>

    4) Configure httpfs-site.xml

    <configuration>
       
       <!-- Hue -->
       <property>
          <name>hadoop.proxyuser.hue.hosts</name>
          <value>*</value>
       </property>
       <property>
          <name>hadoop.proxyuser.hue.groups</name>
          <value>*</value>
       </property>
       
    </configuration>

    5) Configure YARN in hue.ini

      [[yarn_clusters]]
    
        [[[default]]]
          # Enter the host on which you are running the ResourceManager
          resourcemanager_host=beifeng-hadoop-02
    
          # The port where the ResourceManager IPC listens on
          resourcemanager_port=8032
    
          # Whether to submit jobs to this cluster
          submit_to=True
    
          # Resource Manager logical name (required for HA)
          ## logical_name=
    
          # Change this if your YARN cluster is Kerberos-secured
          ## security_enabled=false
    
          # URL of the ResourceManager API
          resourcemanager_api_url=http://beifeng-hadoop-02:8088
    
          # URL of the ProxyServer API
          proxy_api_url=http://beifeng-hadoop-02:8088
    
          # URL of the HistoryServer API
          history_server_api_url=http://beifeng-hadoop-02:19888

    6) Restart the HDFS cluster
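
    A minimal restart sketch, assuming a script-managed (non-HA) cluster, run from the Hadoop home directory:

    sbin/stop-dfs.sh
    sbin/start-dfs.sh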

    7) Start HttpFS

    sbin/httpfs.sh start 
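
    With HDFS restarted and HttpFS running, both endpoints can be smoke-tested with the standard WebHDFS LISTSTATUS call; a sketch, assuming a user named hue exists:

    # WebHDFS served by the NameNode
    curl "http://beifeng-hadoop-02:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue"

    # The same call through HttpFS
    curl "http://beifeng-hadoop-02:14000/webhdfs/v1/?op=LISTSTATUS&user.name=hue"

    Both should return a JSON FileStatuses listing of the HDFS root.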

    3. Integrate Hive

    1) Edit the [beeswax] section in hue.ini

      # Host where HiveServer2 is running.
      # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
      hive_server_host=beifeng-hadoop-02
    
      # Port where HiveServer2 Thrift server runs on.
      hive_server_port=10000
    
      # Hive configuration directory, where hive-site.xml is located
      hive_conf_dir=/opt/modules/cdh/hive-0.13.1-cdh5.3.6/conf
    
      # Timeout in seconds for thrift calls to Hive service
      server_conn_timeout=120

    2) Edit hive-site.xml

    <property>
      <name>hive.server2.authentication</name>
      <value>NOSASL</value>
      <description>
        Client authentication types.
           NONE: no authentication check
           LDAP: LDAP/AD based authentication
           KERBEROS: Kerberos/GSSAPI authentication
           CUSTOM: Custom authentication provider
                   (Use with property hive.server2.custom.authentication.class)
           PAM: Pluggable authentication module.
      </description>
    </property>

    3) Restart the metastore and HiveServer2

    nohup hive --service metastore > ~/hive_metastore.run.log 2>&1 &
    nohup hive --service hiveserver2 > ~/hiveserver2.run.log 2>&1 &
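
    Connectivity to HiveServer2 can be checked with beeline before testing from Hue; a sketch, noting that with NOSASL authentication the JDBC URL must carry auth=noSasl:

    # "show databases;" at the prompt should list at least "default"
    beeline -u "jdbc:hive2://beifeng-hadoop-02:10000/default;auth=noSasl"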

    4) Verify Hive from the Hue UI

    4. Integrate Oozie

    1) Add the following to oozie-site.xml

        <!-- Default proxyuser configuration for Hue -->
        <property>
            <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
            <value>*</value>
        </property>
    
        <property>
            <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
            <value>*</value>
        </property>

    2) Enable the Oozie configuration in hue.ini

    [liboozie]
      # The URL where the Oozie service runs on. This is required in order for
      # users to submit jobs. Empty value disables the config check.
      oozie_url=http://beifeng-hadoop-02:11000/oozie
    
      # Requires FQDN in oozie_url if enabled
      ## security_enabled=false
    
      # Location on HDFS where the workflows/coordinator are deployed when submitted.
      remote_deployement_dir=/user/hue/oozie/deployments
    
    
    ###########################################################################
    # Settings to configure the Oozie app
    ###########################################################################
    
    [oozie]
      # Location on local FS where the examples are stored.
      ## local_data_dir=..../examples
    
      # Location on local FS where the data for the examples is stored.
      ## sample_data_dir=...thirdparty/sample_data
    
      # Location on HDFS where the oozie examples and workflows are stored.
      remote_data_dir=/user/hue/oozie/workspaces
    
      # Maximum number of Oozie workflows or coordinators to retrieve in one API call.
      oozie_jobs_count=100
    
      # Use Cron format for defining the frequency of a Coordinator instead of the old frequency number/unit.
      enable_cron_scheduling=true
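
    Once Oozie is running, the oozie_url configured above can be verified with the Oozie CLI; a sketch, run from the Oozie installation directory:

    # Expect: System mode : NORMAL
    bin/oozie admin -oozie http://beifeng-hadoop-02:11000/oozie -status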

    3) Fix the issue that the Oozie Share Lib cannot be installed to the default location /user/oozie/share/lib

    (1) Edit oozie-site.xml

        <property>
            <name>oozie.service.WorkflowAppService.system.libpath</name>
            <value>/user/oozie/share/lib</value>
            <description>
                System library path to use for workflow applications.
                This path is added to workflow application if their job properties sets
                the property 'oozie.use.system.libpath' to true.
            </description>
        </property>

    (2) Extract and upload the bundled share libs to /user/oozie/share/lib on HDFS

    bin/oozie-setup.sh sharelib create -fs hdfs://beifeng-hadoop-02:9000/ -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
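
    To confirm the share lib actually landed on HDFS, list the target directory (a sketch, assuming the Hadoop client is on the PATH):

    # Expect per-component subdirectories such as hive, pig and sqoop
    hdfs dfs -ls -R /user/oozie/share/lib | head -20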

    (3) Restart Oozie

    (4) Restart Hue

    5. Integrate HBase

    1) Edit the [hbase] section in hue.ini

    [hbase]
      # Comma-separated list of HBase Thrift servers for clusters in the format of '(name|host:port)'.
      # Use full hostname with security.
      hbase_clusters=(HBaseCluster|beifeng-hadoop-02:9090)
    
      # HBase configuration directory, where hbase-site.xml is located.
      hbase_conf_dir=/opt/modules/cdh/hbase-0.98.6-cdh5.3.6/conf
    
      # Hard limit of rows or columns per row fetched before truncating.
      ## truncate_limit = 500
    
      # 'buffered' is the default of the HBase Thrift Server and supports security.
      # 'framed' can be used to chunk up responses,
      # which is useful when used in conjunction with the nonblocking server in Thrift.
      ## thrift_transport=buffered

    2) Start HBase

    bin/start-hbase.sh

    3) Start the Thrift server

    bin/hbase-daemon.sh start thrift
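
    A quick way to confirm both daemons are up is to check the JVM process list and the Thrift port configured in hue.ini above; a minimal sketch:

    # HMaster for HBase itself, ThriftServer for the gateway Hue talks to
    jps | grep -E 'HMaster|ThriftServer'

    # The Thrift server should be listening on port 9090
    netstat -tlnp 2>/dev/null | grep 9090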