Installing Hue requires the following prerequisites:
1. Java JDK 8+
2. Python 2.6+
3. A non-root user to install with
4. Maven and Ant installed separately, with their HOME variables configured
5. The Hadoop and Hive configuration files available on the node where Hue will be installed
6. A MySQL database prepared to serve as Hue's metadata store
Step 1: Preparing the JDK and Python from the prerequisites is straightforward and not shown here. For Maven and Ant, extracting the packages is the whole installation; configure their HOME variables and verify that the versions report correctly.
export MAVEN_HOME=/opt/apache-maven-3.3.9
export ANT_HOME=/opt/apache-ant-1.8.1
export PATH=$PATH:$MAVEN_HOME/bin
export PATH=$PATH:$ANT_HOME/bin
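After appending these exports (to /etc/profile or your own shell profile; the exact file is your choice), reload the environment and verify both tools respond:
source /etc/profile
mvn -v
ant -version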
Step 2: Install the third-party packages Hue depends on
yum install -y asciidoc cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-plain gcc gcc-c++ krb5-devel libffi-devel libtidy libxml2-devel libxslt-devel make mysql mysql-devel openldap-devel python-devel sqlite-devel openssl-devel gmp-devel
Step 3: Build Hue. First, extract the Hue tarball:
tar -zxvf hue-3.9.0-cdh5.14.0.tar.gz
mv hue-3.9.0-cdh5.14.0 hue3.9
Then change into Hue's home directory and build it. The build generates a desktop directory, referred to below as the install directory.
cd hue3.9
make apps
The build prints a lot of output and may appear to hang at times; no extra action is needed. It takes roughly 3 minutes, and as long as no errors are reported, it is fine.
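A quick sanity check that the build finished, run from the Hue home directory (the scripts listed here are the same launchers invoked in the later steps):
ls build/env/bin
# the hue and supervisor scripts should appear in this directory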
Step 4: Create a system user hue, grant it ownership of the entire Hue directory, and perform all of the following operations as that user.
useradd hue
passwd hue
(enter the password when prompted)
chown -R hue:hue /opt/hue3.9/
su hue
Step 5: Edit the desktop/conf/hue.ini file. Find and modify the following entries; apart from secret_key, change them to your own values. They sit around line 20 of the config file.
secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
http_host=hdp3
http_port=8888
time_zone=Asia/Shanghai
Step 6: Run the start command to bring Hue up for the first time and make sure the service itself works:
build/env/bin/supervisor
If the service is healthy, it prints service information like the following:
[hue@hdp3 hue3.9]$ ./build/env/bin/supervisor
[INFO] Not running as root, skipping privilege drop
starting server with options:
{'daemonize': False,
'host': '0.0.0.0',
'pidfile': None,
'port': 8888,
'server_group': 'hue',
'server_name': 'localhost',
'server_user': 'hue',
'ssl_certificate': None,
'ssl_certificate_chain': None,
'ssl_cipher_list': 'ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:CAMELLIA:DES-CBC3-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!MD5:!PSK:!aECDH:!EDH-DSS-DES-CBC3-SHA:!EDH-RSA-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA',
'ssl_private_key': None,
'threads': 50,
'workdir': None}
Now visit port 8888 on the install node; the Hue welcome page should appear. Don't rush to the next step yet; go back to the terminal and stop the foreground service with Ctrl+C.
Step 7: Hadoop needs some configuration to support Hue. Edit hdfs-site.xml and append the following:
<!-- Enable WebHDFS support in HDFS -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<!-- Disable permission checking -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
Edit core-site.xml and append the following:
<!-- Authorize the hue user: tell Hadoop that all hosts (*) may operate on HDFS through the hue proxy user -->
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
If your Hadoop cluster is highly available, also append the following to httpfs-site.xml:
<!-- All hosts may access the HttpFS service -->
<property>
<name>httpfs.proxyuser.httpfs.hosts</name>
<value>*</value>
</property>
<property>
<name>httpfs.proxyuser.httpfs.groups</name>
<value>*</value>
</property>
The difference: WebHDFS is a component built into HDFS and already runs inside the NameNode and DataNodes; reads and writes of HDFS files are redirected to the DataNode holding the data and make full use of HDFS bandwidth. HttpFS is a separate service outside HDFS; reads and writes are relayed through it, it can limit bandwidth usage, and Hue talks to a highly available HDFS through HttpFS.
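For reference (hostnames here are placeholders, not from this cluster): with plain WebHDFS the URL configured later in hue.ini points at the active NameNode's default HTTP port 50070, while with HttpFS it points at the HttpFS daemon's default port 14000:
webhdfs_url=http://<namenode-host>:50070/webhdfs/v1
webhdfs_url=http://<httpfs-host>:14000/webhdfs/v1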
Sync the modified configuration files to the other Hadoop nodes with scp.
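A minimal sketch of that sync, assuming the Hadoop layout used later in this guide (/opt/hadoop-2.7.2) and that the other nodes are hdp2 and hdp3:
cd /opt/hadoop-2.7.2/etc/hadoop
scp hdfs-site.xml core-site.xml httpfs-site.xml hdp2:/opt/hadoop-2.7.2/etc/hadoop/
scp hdfs-site.xml core-site.xml httpfs-site.xml hdp3:/opt/hadoop-2.7.2/etc/hadoop/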
Step 8: Configure Hue's HDFS settings (there is a catch here, discussed at the end). In hue.ini, find the section around line 850 and modify the hadoop-hdfs configuration. The YARN configuration below must be changed as well; note that some entries need to be uncommented, and the YARN HA section needs a few properties copied in.
[hadoop]
# Configuration for HDFS NameNode
# ------------------------------------------------------------------------
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://hdp1
# For HA, the NameNode nameservice (logical name) must also be defined here
logical_name=hdp1
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://hdp1:14000/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
# Directory of the Hadoop configuration
hadoop_conf_dir=/opt/hadoop-2.7.2/etc/hadoop
# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=hdp2
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster and monitor their execution
submit_to=True
# For HA, specify the ResourceManager logical name here
logical_name=yrc
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://hdp2:8088
# URL of the ProxyServer API
proxy_api_url=http://hdp2:8088
# History server
history_server_api_url=http://hdp3:19888
# Spark history server address
spark_history_server_url=http://hdp3:18088
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
# HA support by specifying multiple clusters.
# Redefine different properties there.
# e.g.
[[[ha]]]
# Resource Manager logical name (required for HA)
logical_name=yrc
# Un-comment to enable
## submit_to=True
# URL of the ResourceManager API
resourcemanager_api_url=http://hdp3:8088
# URL of the ProxyServer API
proxy_api_url=http://hdp3:8088
# Enter the host on which you are running the ResourceManager
resourcemanager_host=hdp3
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
Step 9: Configure Hive: find the [beeswax] section in the Hue config file and set its entries.
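A minimal sketch of that section, assuming HiveServer2 runs on hdp3 with the default port 10000 and hive-site.xml lives under /opt/hive/conf (adjust to your layout):
[beeswax]
  # Host where HiveServer2 is running
  hive_server_host=hdp3
  # Port where the HiveServer2 Thrift server listens
  hive_server_port=10000
  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/opt/hive/conf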
Step 10: Set up Hue's metadata database. First, in the prepared MySQL service, create the database and grant privileges:
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'Hue123!';
flush privileges;
Then edit the Hue config file, find the [database] section, and modify the following properties.
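A minimal sketch of that section (in hue.ini it appears as [[database]] nested under [desktop]), reusing the database, user, and password created above and assuming MySQL runs on hdp3 (change host and port to match your MySQL service):
  [[database]]
    engine=mysql
    host=hdp3
    port=3306
    user=hue
    password=Hue123!
    name=hue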
Put the MySQL driver jar into desktop/libs under the install directory:
[hue@hdp3 libs]$ ll |grep mysql
-rw-r--r-- 1 hue hue 872303 10月 29 22:13 mysql-connector-java-5.1.27-bin.jar
Then, from the install directory, run the initialization command:
./build/env/bin/hue syncdb
The following output appears; follow the prompts:
[hue@hdp3 hue3.9]$ ./build/env/bin/hue syncdb
Syncing...
Creating tables ...
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_groups
Creating table auth_user_user_permissions
Creating table auth_user
Creating table django_openid_auth_nonce
Creating table django_openid_auth_association
Creating table django_openid_auth_useropenid
Creating table django_content_type
Creating table django_session
Creating table django_site
Creating table django_admin_log
Creating table south_migrationhistory
Creating table axes_accessattempt
Creating table axes_accesslog
You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (leave blank to use 'hue'): hue    # this sets Hue's admin account, i.e. the administrator of the web UI on port 8888
Email address:    # no email is needed, just press Enter
Password:    # the password for the Hue admin account
Password (again):
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)
Synced:
> django.contrib.auth
> django_openid_auth
> django.contrib.contenttypes
> django.contrib.sessions
> django.contrib.sites
> django.contrib.staticfiles
> django.contrib.admin
> south
> axes
> about
> filebrowser
> help
> impala
> jobbrowser
> metastore
> proxy
> rdbms
> zookeeper
> indexer
> dashboard
Not synced (use migrations):
- django_extensions
- desktop
- beeswax
- hbase
- jobsub
- oozie
- pig
- search
- security
- spark
- sqoop
- useradmin
- notebook
(use ./manage.py migrate to migrate these)
Once the output above finishes with the prompt (use ./manage.py migrate to migrate these), run the following command:
./build/env/bin/hue migrate
It prints a long stretch of log output, which can be ignored:
[hue@hdp3 hue3.9]$ ./build/env/bin/hue migrate
Running migrations for django_extensions:
- Migrating forwards to 0001_empty.
> django_extensions:0001_empty
- Loading initial data for django_extensions.
Installed 0 object(s) from 0 fixture(s)
Running migrations for desktop:
- Migrating forwards to 0027_truncate_documents.
> pig:0001_initial
> oozie:0001_initial
> oozie:0002_auto__add_hive
> oozie:0003_auto__add_sqoop
> oozie:0004_auto__add_ssh
> oozie:0005_auto__add_shell
> oozie:0006_auto__chg_field_java_files__chg_field_java_archives__chg_field_sqoop_f
> oozie:0007_auto__chg_field_sqoop_script_path
> oozie:0008_auto__add_distcp
> oozie:0009_auto__add_decision
> oozie:0010_auto__add_fs
> oozie:0011_auto__add_email
> oozie:0012_auto__add_subworkflow__chg_field_email_subject__chg_field_email_body
> oozie:0013_auto__add_generic
> oozie:0014_auto__add_decisionend
> oozie:0015_auto__add_field_dataset_advanced_start_instance__add_field_dataset_ins
> oozie:0016_auto__add_field_coordinator_job_properties
> oozie:0017_auto__add_bundledcoordinator__add_bundle
> oozie:0018_auto__add_field_workflow_managed
> oozie:0019_auto__add_field_java_capture_output
> oozie:0020_chg_large_varchars_to_textfields
> oozie:0021_auto__chg_field_java_args__add_field_job_is_trashed
> oozie:0022_auto__chg_field_mapreduce_node_ptr__chg_field_start_node_ptr
> oozie:0022_change_examples_path_format
- Migration 'oozie:0022_change_examples_path_format' is marked for no-dry-run.
> oozie:0023_auto__add_field_node_data__add_field_job_data
> oozie:0024_auto__chg_field_subworkflow_sub_workflow
> oozie:0025_change_examples_path_format
- Migration 'oozie:0025_change_examples_path_format' is marked for no-dry-run.
> desktop:0001_initial
> desktop:0002_add_groups_and_homedirs
> desktop:0003_group_permissions
> desktop:0004_grouprelations
> desktop:0005_settings
> desktop:0006_settings_add_tour
> beeswax:0001_initial
> beeswax:0002_auto__add_field_queryhistory_notify
> beeswax:0003_auto__add_field_queryhistory_server_name__add_field_queryhistory_serve
> beeswax:0004_auto__add_session__add_field_queryhistory_server_type__add_field_query
> beeswax:0005_auto__add_field_queryhistory_statement_number
> beeswax:0006_auto__add_field_session_application
> beeswax:0007_auto__add_field_savedquery_is_trashed
> beeswax:0008_auto__add_field_queryhistory_query_type
> beeswax:0009_auto__add_field_savedquery_is_redacted__add_field_queryhistory_is_reda
> desktop:0007_auto__add_documentpermission__add_documenttag__add_document
> desktop:0008_documentpermission_m2m_tables
> desktop:0009_auto__chg_field_document_name
> desktop:0010_auto__add_document2__chg_field_userpreferences_key__chg_field_userpref
> desktop:0011_auto__chg_field_document2_uuid
> desktop:0012_auto__chg_field_documentpermission_perms
> desktop:0013_auto__add_unique_documenttag_owner_tag
> desktop:0014_auto__add_unique_document_content_type_object_id
> desktop:0015_auto__add_unique_documentpermission_doc_perms
> desktop:0016_auto__add_unique_document2_uuid_version_is_history
> desktop:0017_auto__add_document2permission__add_unique_document2permission_doc_perm
> desktop:0018_auto__add_field_document2_parent_directory
> desktop:0019_auto
> desktop:0020_auto__del_field_document2permission_all
> desktop:0021_auto__add_defaultconfiguration__add_unique_defaultconfiguration_app_is
> desktop:0022_auto__del_field_defaultconfiguration_group__del_unique_defaultconfigur
> desktop:0023_auto__del_unique_defaultconfiguration_app_is_default_user__add_field_d
> desktop:0024_auto__add_field_document2_is_managed
> desktop:0025_auto__add_field_document2_is_trashed
> desktop:0026_change_is_trashed_default_to_false
- Migration 'desktop:0026_change_is_trashed_default_to_false' is marked for no-dry-run.
> desktop:0027_truncate_documents
- Loading initial data for desktop.
Installed 0 object(s) from 0 fixture(s)
Running migrations for beeswax:
- Migrating forwards to 0014_auto__add_field_queryhistory_is_cleared.
> beeswax:0009_auto__chg_field_queryhistory_server_port
> beeswax:0010_merge_database_state
> beeswax:0011_auto__chg_field_savedquery_name
> beeswax:0012_auto__add_field_queryhistory_extra
> beeswax:0013_auto__add_field_session_properties
> beeswax:0014_auto__add_field_queryhistory_is_cleared
- Loading initial data for beeswax.
Installed 0 object(s) from 0 fixture(s)
Running migrations for hbase:
- Migrating forwards to 0001_initial.
> hbase:0001_initial
- Loading initial data for hbase.
Installed 0 object(s) from 0 fixture(s)
Running migrations for jobsub:
- Migrating forwards to 0006_chg_varchars_to_textfields.
> jobsub:0001_initial
> jobsub:0002_auto__add_ooziestreamingaction__add_oozieaction__add_oozieworkflow__ad
> jobsub:0003_convertCharFieldtoTextField
> jobsub:0004_hue1_to_hue2
- Migration 'jobsub:0004_hue1_to_hue2' is marked for no-dry-run.
> jobsub:0005_unify_with_oozie
- Migration 'jobsub:0005_unify_with_oozie' is marked for no-dry-run.
> jobsub:0006_chg_varchars_to_textfields
- Loading initial data for jobsub.
Installed 0 object(s) from 0 fixture(s)
Running migrations for oozie:
- Migrating forwards to 0027_auto__chg_field_node_name__chg_field_job_name.
> oozie:0026_set_default_data_values
- Migration 'oozie:0026_set_default_data_values' is marked for no-dry-run.
> oozie:0027_auto__chg_field_node_name__chg_field_job_name
- Loading initial data for oozie.
Installed 0 object(s) from 0 fixture(s)
Running migrations for pig:
- Nothing to migrate.
- Loading initial data for pig.
Installed 0 object(s) from 0 fixture(s)
Running migrations for search:
- Migrating forwards to 0003_auto__add_field_collection_owner.
> search:0001_initial
> search:0002_auto__del_core__add_collection
> search:0003_auto__add_field_collection_owner
- Loading initial data for search.
Installed 0 object(s) from 0 fixture(s)
? You have no migrations for the 'security' app. You might want some.
Running migrations for spark:
- Migrating forwards to 0001_initial.
> spark:0001_initial
- Loading initial data for spark.
Installed 0 object(s) from 0 fixture(s)
Running migrations for sqoop:
- Migrating forwards to 0001_initial.
> sqoop:0001_initial
- Loading initial data for sqoop.
Installed 0 object(s) from 0 fixture(s)
Running migrations for useradmin:
- Migrating forwards to 0008_convert_documents.
> useradmin:0001_permissions_and_profiles
- Migration 'useradmin:0001_permissions_and_profiles' is marked for no-dry-run.
> useradmin:0002_add_ldap_support
- Migration 'useradmin:0002_add_ldap_support' is marked for no-dry-run.
> useradmin:0003_remove_metastore_readonly_huepermission
- Migration 'useradmin:0003_remove_metastore_readonly_huepermission' is marked for no-dry-run.
> useradmin:0004_add_field_UserProfile_first_login
> useradmin:0005_auto__add_field_userprofile_last_activity
> useradmin:0006_auto__add_index_userprofile_last_activity
> useradmin:0007_remove_s3_access
> useradmin:0008_convert_documents
- Migration 'useradmin:0008_convert_documents' is marked for no-dry-run.
Starting document conversions...
Finished running document conversions.
- Loading initial data for useradmin.
Installed 0 object(s) from 0 fixture(s)
Running migrations for notebook:
- Migrating forwards to 0001_initial.
> notebook:0001_initial
- Loading initial data for notebook.
Installed 0 object(s) from 0 fixture(s)
Then check the prepared MySQL database; it should now contain many tables.
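A quick way to check, from the MySQL client (the database name follows the CREATE DATABASE statement above):
mysql> USE hue;
mysql> SHOW TABLES;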
Now start Hadoop and HiveServer2. When starting Hadoop, also run sbin/httpfs.sh start separately to start Hadoop's HttpFS service. Finally, restart Hue and visit port 8888.
Log in with the admin account you created when initializing the metadata database. If everything works, you can run Hive statements, and the top-right corner lets you view the logs of jobs on YARN.
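A rough startup sequence, assuming the Hadoop 2.7.2 layout referenced in hue.ini above and that HIVE_HOME is set (adjust paths and nodes to your cluster):
# on the Hadoop master node(s)
/opt/hadoop-2.7.2/sbin/start-dfs.sh
/opt/hadoop-2.7.2/sbin/start-yarn.sh
/opt/hadoop-2.7.2/sbin/httpfs.sh start
# start HiveServer2 in the background
nohup $HIVE_HOME/bin/hiveserver2 > hiveserver2.log 2>&1 &
# restart Hue from its install directory
cd /opt/hue3.9 && ./build/env/bin/supervisor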
Finally, about the problem I mentioned earlier when configuring HDFS: clicking around the UI you'll notice that you can only see the list of YARN jobs and cannot click through to the log details, and previewing HDFS files from the left-hand panel throws an oozie error. This is because Hue's own HDFS file access depends on the Oozie software. For personal testing or learning you normally install the CDH build of Hue, and Oozie, both in its own behavior and in how CDH affects it, is a large and hard-to-control factor; CDH's default HDFS user and the underlying permission controls, for example, are quite complex. Using this component directly is therefore not very practical. My personal suggestion is to use Hue only for writing Hive SQL, since it has auto-completion; for everything else Hue is not that capable anyway, and I recommend Zeppelin instead. My own test cluster uses Zeppelin, and the only reason I keep Hue around is that Zeppelin has no completion when writing code; otherwise I would have removed it long ago.
For how to install Zeppelin, see: 大数据原生集群 (Hadoop2.X为核心) 本地测试环境搭建四