Installing Hue 3.5

Download Hue 3.5.0

wget https://dl.dropboxusercontent.com/u/730827/hue/releases/3.5.0/hue-3.5.0.tgz
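
If you want to sanity-check the download before unpacking (an optional step, assuming the tarball was saved to the current directory), listing the archive contents is a quick way to confirm it came down intact:

# Peek at the first entries of the tarball
tar tzf hue-3.5.0.tgz | head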

Extract

sudo tar zxvf hue-3.5.0.tgz

The extracted directory contains:

scott@master:/var/tmp/hue-3.5.0$ ll
total 80
drwxrwxr-x 8 scott scott 4096 Dec 6 07:18 ./
drwxrwxrwt 5 root root 4096 Apr 10 20:16 ../
drwxrwxr-x 20 scott scott 4096 Dec 6 07:18 apps/
drwxrwxr-x 5 scott scott 4096 Dec 6 07:18 desktop/
drwxrwxr-x 6 scott scott 4096 Dec 6 07:18 docs/
drwxrwxr-x 3 scott scott 4096 Dec 6 07:18 ext/
-rw-rw-r-- 1 scott scott 11358 Dec 6 07:12 LICENSE.txt
-rw-rw-r-- 1 scott scott 4715 Dec 6 07:18 Makefile
-rw-rw-r-- 1 scott scott 8505 Dec 6 07:12 Makefile.sdk
-rw-rw-r-- 1 scott scott 3652 Dec 6 07:12 Makefile.vars
-rw-rw-r-- 1 scott scott 2192 Dec 6 07:18 Makefile.vars.priv
drwxrwxr-x 2 scott scott 4096 Dec 6 07:13 maven/
-rw-rw-r-- 1 scott scott 1517 Dec 6 07:18 README
drwxrwxr-x 4 scott scott 4096 Dec 6 07:18 tools/
-rw-rw-r-- 1 scott scott 932 Dec 6 07:12 VERSION

Install the required dependencies

sudo apt-get update
sudo apt-get install libxml2-dev libxslt-dev libsasl2-dev libsasl2-modules-gssapi-mit libmysqlclient-dev python-dev python-setuptools python-simplejson libsqlite3-dev libsasl2-dev libsasl2-modules-gssapi-mit libtidy-0.99-0 libkrb5-dev libldap2-dev
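
The build compiles a few native Python extensions, so a working C toolchain and Python 2.6/2.7 are assumed to be present as well; a quick pre-flight check (not part of the original steps) might look like:

# Confirm the compiler, make, and the Python version Hue 3.5 expects (2.6/2.7)
gcc --version | head -1
make --version | head -1
python -V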

Install

Change into the extracted hue-3.5.0 directory and run the following command; Hue will be installed under /opt:

PREFIX=/opt make install

The installed directory structure is:

scott@master:/var/tmp/hue-3.5.0$ cd /opt/hue/
scott@master:/opt/hue$ ll
total 84
drwxrwxr-x 7 scott scott 4096 Apr 10 20:49 ./
drwxr-xr-x 20 scott scott 4096 Apr 10 20:48 ../
-rw-rw-r-- 1 scott scott 2633 Apr 10 20:49 app.reg
drwxrwxr-x 21 scott scott 4096 Apr 10 20:49 apps/
drwxrwxr-x 3 scott scott 4096 Apr 10 20:48 build/
drwxrwxr-x 5 scott scott 4096 Apr 10 20:49 desktop/
drwxrwxr-x 3 scott scott 4096 Dec 6 07:18 ext/
-rw-rw-r-- 1 scott scott 11358 Dec 6 07:12 LICENSE.txt
-rw-rw-r-- 1 scott scott 4715 Dec 6 07:18 Makefile
-rw-rw-r-- 1 scott scott 44 Apr 10 20:48 Makefile.buildvars
-rw-rw-r-- 1 scott scott 8505 Dec 6 07:12 Makefile.sdk
-rw-rw-r-- 1 scott scott 3652 Dec 6 07:12 Makefile.vars
-rw-rw-r-- 1 scott scott 2192 Dec 6 07:18 Makefile.vars.priv
-rw-rw-r-- 1 scott scott 1517 Dec 6 07:18 README
drwxrwxr-x 4 scott scott 4096 Apr 10 20:48 tools/
-rw-rw-r-- 1 scott scott 932 Dec 6 07:12 VERSION
scott@master:/opt/hue$

If the install fails with an ImportError for Python's _sysconfigdata_nd module (see the similar errors linked below), symlink it into the Python path and rerun the install.

Solution:

32-bit:

sudo ln -fs /usr/lib/python2.7/plat-i386-linux-gnu/_sysconfigdata_nd.py /usr/lib/python2.7/

64-bit:

sudo ln -fs /usr/lib/python2.7/plat-x86_64-linux-gnu/_sysconfigdata_nd.py /usr/lib/python2.7/
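
If you are unsure which of the two variants applies, check the machine architecture first:

# x86_64 means the 64-bit symlink applies; i686/i386 means the 32-bit one
uname -m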

Similar errors:

http://www.marshut.com/irrnwm/hue-make-error.html#irtpyi
https://issues.cloudera.org/browse/HUE-1672
https://github.com/pypa/virtualenv/issues/410

View the configuration

scott@master:/opt/hue$ build/env/bin/hue config_help | less

Create the hue user

sudo useradd -s /sbin/nologin hue
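
An optional check that the account was created and which shell it got:

id hue
getent passwd hue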

Detailed configuration

Edit /opt/hue/desktop/conf/hue.ini:

# Hue configuration file
# ===================================
#
# For complete documentation about the contents of this file, run
# $ <hue_root>/build/env/bin/hue config_help
#
# All .ini files under the current directory are treated equally. Their
# contents are merged to form the Hue configuration, which can be
# viewed in Hue at
# http://<hue_host>:<port>/dump_config
###########################################################################
# General configuration for core Desktop features (authentication, etc)
###########################################################################
[desktop]
# Set this to a random string, the longer the better.
# This is used for secure hashing in the session store.
secret_key=
# Webserver listens on this address and port
http_host=master
http_port=8888
# Time zone name
time_zone=America/Los_Angeles
# Enable or disable Django debug mode.
django_debug_mode=false
# Enable or disable backtrace for server error
http_500_debug_mode=false
# Server email for internal error messages
## django_server_email='hue@localhost.localdomain'
# Email backend
## django_email_backend=django.core.mail.backends.smtp.EmailBackend
# Webserver runs as this user
server_user=hue
server_group=hue
# If set to false, runcpserver will not actually start the web server.
# Used if Apache is being used as a WSGI container.
## enable_server=yes
# Number of threads used by the CherryPy web server
## cherrypy_server_threads=10
# Filename of SSL Certificate
## ssl_certificate=
# Filename of SSL RSA Private Key
## ssl_private_key=
# Default encoding for site data
default_site_encoding=utf-8
# Help improve Hue with anonymous usage analytics.
# Use Google Analytics to see how many times an application or specific section of an application is used, nothing more.
## collect_usage=true
## Comma-separated list of regular expressions, which match the redirect URL.
## For example, to restrict to your local domain and FQDN, the following value can be used:
## ^\/.*$,^http:\/\/www.mydomain.com\/.*$
# redirect_whitelist=
# Administrators
# ----------------
[[django_admins]]
## [[[admin1]]]
## name=john
## email=john@doe.com
# UI customizations
# -------------------
[[custom]]
# Top banner HTML code
## banner_top_html=
# Configuration options for user authentication into the web application
# ------------------------------------------------------------------------
[[auth]]
# Authentication backend. Common settings are:
# - django.contrib.auth.backends.ModelBackend (entirely Django backend)
# - desktop.auth.backend.AllowAllBackend (allows everyone)
# - desktop.auth.backend.AllowFirstUserDjangoBackend
# (Default. Relies on Django and user manager, after the first login)
# - desktop.auth.backend.LdapBackend
# - desktop.auth.backend.PamBackend
# - desktop.auth.backend.SpnegoDjangoBackend
# - desktop.auth.backend.RemoteUserDjangoBackend
# - desktop.auth.backend.OAuthBackend
# - libsaml.backend.SAML2Backend
## backend=desktop.auth.backend.AllowFirstUserDjangoBackend
# Backend to synchronize user-group membership with
## user_group_membership_synchronization_backend=desktop.auth.backend.LdapSynchronizationBackend
## pam_service=login
# When using the desktop.auth.backend.RemoteUserDjangoBackend, this sets
# the normalized name of the header that contains the remote user.
# The HTTP header in the request is converted to a key by converting
# all characters to uppercase, replacing any hyphens with underscores
# and adding an HTTP_ prefix to the name. So, for example, if the header
# is called Remote-User that would be configured as HTTP_REMOTE_USER
#
# Defaults to HTTP_REMOTE_USER
## remote_user_header=HTTP_REMOTE_USER
# Configuration options for connecting to LDAP and Active Directory
# -------------------------------------------------------------------
[[ldap]]
# The search base for finding users and groups
## base_dn="DC=mycompany,DC=com"
# The NT domain to connect to (only for use with Active Directory)
## nt_domain=mycompany.com
# URL of the LDAP server
## ldap_url=ldap://auth.mycompany.com
# A PEM-format file containing certificates for the CA's that
# Hue will trust for authentication over TLS.
# The certificate for the CA that signed the
# LDAP server certificate must be included among these certificates.
# See more here http://www.openldap.org/doc/admin24/tls.html.
## ldap_cert=
## use_start_tls=true
# Distinguished name of the user to bind as -- not necessary if the LDAP server
# supports anonymous searches
## bind_dn="CN=ServiceAccount,DC=mycompany,DC=com"
# Password of the bind user -- not necessary if the LDAP server supports
# anonymous searches
## bind_password=
# Pattern for searching for usernames -- Use <username> for the parameter
# For use when using LdapBackend for Hue authentication
## ldap_username_pattern="uid=<username>,ou=People,dc=mycompany,dc=com"
# Create users in Hue when they try to login with their LDAP credentials
# For use when using LdapBackend for Hue authentication
## create_users_on_login = true
# Ignore the case of usernames when searching for existing users in Hue.
## ignore_username_case=false
# Force usernames to lowercase when searching for existing users in Hue.
## force_username_lowercase=false
# Use search bind authentication.
## search_bind_authentication=true
[[[users]]]
# Base filter for searching for users
## user_filter="objectclass=*"
# The username attribute in the LDAP schema
## user_name_attr=sAMAccountName
[[[groups]]]
# Base filter for searching for groups
## group_filter="objectclass=*"
# The group name attribute in the LDAP schema
## group_name_attr=cn
# The attribute of the group object which identifies the members of the group
## group_member_attr=members
# Configuration options for specifying the Desktop Database. For more info,
# see http://docs.djangoproject.com/en/1.1/ref/settings/#database-engine
# ------------------------------------------------------------------------
[[database]]
# Database engine is typically one of:
# postgresql_psycopg2, mysql, or sqlite3
#
# Note that for sqlite3, 'name', below is a filename;
# for other backends, it is the database name.
## engine=sqlite3
## host=
## port=
## user=
## password=
## name=desktop/desktop.db
## options={}
# Configuration options for specifying the Desktop session.
# For more info, see https://docs.djangoproject.com/en/1.4/topics/http/sessions/
# ------------------------------------------------------------------------
[[session]]
# The cookie containing the users' session ID will expire after this amount of time in seconds.
## ttl=60*60*24*14
# The cookie containing the users' session ID will be secure.
# Should only be enabled with HTTPS.
## secure=false
# The cookie containing the users' session ID will use the HTTP only flag.
## http_only=false
# Configuration options for connecting to an external SMTP server
# ------------------------------------------------------------------------
[[smtp]]
# The SMTP server information for email notification delivery
host=localhost
port=25
user=
password=
# Whether to use a TLS (secure) connection when talking to the SMTP server
tls=no
# Default email address to use for various automated notification from Hue
## default_from_email=hue@localhost
# Configuration options for Kerberos integration for secured Hadoop clusters
# ------------------------------------------------------------------------
[[kerberos]]
# Path to Hue's Kerberos keytab file
## hue_keytab=
# Kerberos principal name for Hue
## hue_principal=hue/hostname.foo.com
# Path to kinit
## kinit_path=/path/to/kinit
# Configuration options for using OAuthBackend login
# ------------------------------------------------------------------------
[[oauth]]
# The Consumer key of the application
## consumer_key=XXXXXXXXXXXXXXXXXXXXX
# The Consumer secret of the application
## consumer_secret=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
# The Request token URL
## request_token_url=https://api.twitter.com/oauth/request_token
# The Access token URL
## access_token_url=https://api.twitter.com/oauth/access_token
# The Authorize URL
## authenticate_url=https://api.twitter.com/oauth/authorize
###########################################################################
# Settings to configure SAML
###########################################################################
[libsaml]
# Xmlsec1 binary path. This program should be executable by the user running Hue.
## xmlsec_binary=/usr/local/bin/xmlsec1
# Entity ID for Hue acting as service provider.
# Can also accept a pattern where '<base_url>' will be replaced with server URL base.
## entity_id="<base_url>/saml2/metadata/"
# Create users from SSO on login.
## create_users_on_login=true
# Required attributes to ask for from IdP.
# This requires a comma separated list.
## required_attributes=uid
# Optional attributes to ask for from IdP.
# This requires a comma separated list.
## optional_attributes=
# IdP metadata in the form of a file. This is generally an XML file containing metadata that the Identity Provider generates.
## metadata_file=
# Private key to encrypt metadata with.
## key_file=
# Signed certificate to send along with encrypted metadata.
## cert_file=
# A mapping from attributes in the response from the IdP to django user attributes.
## user_attribute_mapping={'uid':'username'}
# Have Hue initiated authn requests be signed and provide a certificate.
## authn_requests_signed=false
# Have Hue initiated logout requests be signed and provide a certificate.
## logout_requests_signed=false
## Username can be sourced from 'attributes' or 'nameid'.
## username_source=attributes
# Performs the logout or not.
## logout_enabled=true
###########################################################################
# Settings to configure your Hadoop cluster.
###########################################################################
[hadoop]
# Configuration for HDFS NameNode
# ------------------------------------------------------------------------
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://master:9000
# NameNode logical name.
logical_name=hdfs://master:9000
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
webhdfs_url=http://master:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# Configuration for YARN (MR2)
# ------------------------------------------------------------------------
[[yarn_clusters]]
[[[default]]]
# Enter the host on which you are running the ResourceManager
resourcemanager_host=master
# The port where the ResourceManager IPC listens on
resourcemanager_port=8032
# Whether to submit jobs to this cluster
submit_to=True
# Change this if your YARN cluster is Kerberos-secured
## security_enabled=false
# URL of the ResourceManager API
resourcemanager_api_url=http://master:8088
# URL of the ProxyServer API
proxy_api_url=http://master:8088
# URL of the HistoryServer API
history_server_api_url=http://master:19888
# Configuration for MapReduce (MR1)
# ------------------------------------------------------------------------
[[mapred_clusters]]
[[[default]]]
# Enter the host on which you are running the Hadoop JobTracker
## jobtracker_host=localhost
# The port where the JobTracker IPC listens on
## jobtracker_port=8021
# JobTracker logical name.
## logical_name=
# Thrift plug-in port for the JobTracker
## thrift_port=9290
# Whether to submit jobs to this cluster
submit_to=False
# Change this if your MapReduce cluster is Kerberos-secured
## security_enabled=false
# HA support by specifying multiple clusters
# e.g.
# [[[ha]]]
# Enter the host on which you are running the failover JobTracker
# jobtracker_host=localhost-ha
###########################################################################
# Settings to configure liboozie
###########################################################################
[liboozie]
# The URL where the Oozie service runs on. This is required in order for
# users to submit jobs.
oozie_url=http://master:11000/oozie
# Requires FQDN in oozie_url if enabled
## security_enabled=false
# Location on HDFS where the workflows/coordinator are deployed when submitted.
## remote_deployement_dir=/user/hue/oozie/deployments
###########################################################################
# Settings to configure the Oozie app
###########################################################################
[oozie]
# Location on local FS where the examples are stored.
## local_data_dir=..../examples
# Location on local FS where the data for the examples is stored.
## sample_data_dir=...thirdparty/sample_data
# Location on HDFS where the oozie examples and workflows are stored.
## remote_data_dir=/user/hue/oozie/workspaces
# Maximum number of Oozie workflows or coordinators to retrieve in one API call.
## oozie_jobs_count=100
###########################################################################
# Settings to configure Beeswax with Hive
###########################################################################
[beeswax]
# Host where Hive server Thrift daemon is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=master
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/opt/hive-0.11.0/conf
# Timeout in seconds for thrift calls to Hive service
server_conn_timeout=120
# Path to HiveServer2 start script
hive_server_bin=/opt/hive-0.11.0/bin/hiveserver2
# Set a LIMIT clause when browsing a partitioned table.
# A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
## browse_partitioned_table_limit=250
[[ssl]]
# SSL communication enabled for this server.
## enabled=false
# Path to Certificate Authority certificates.
## cacerts=/etc/hue/cacerts.pem
# Path to the private key file.
## key=/etc/hue/key.pem
# Path to the public certificate file.
## cert=/etc/hue/cert.pem
# Choose whether Hue should validate certificates received from the server.
## validate=true
###########################################################################
# Settings to configure Pig
###########################################################################
[pig]
# Location of piggybank.jar on local filesystem.
## local_sample_dir=/usr/share/hue/apps/pig/examples
# Location piggybank.jar will be copied to in HDFS.
## remote_data_dir=/user/hue/pig/examples
###########################################################################
# Settings to configure Sqoop
###########################################################################
[sqoop]
# Sqoop server URL
server_url=http://master:12000/sqoop
###########################################################################
# Settings to configure Proxy
###########################################################################
[proxy]
# Comma-separated list of regular expressions,
# which match 'host:port' of requested proxy target.
## whitelist=(localhost|127\.0\.0\.1):(50030|50070|50060|50075)
# Comma-separated list of regular expressions,
# which match any prefix of 'host:port/path' of requested proxy target.
# This does not support matching GET parameters.
## blacklist=()
###########################################################################
# Settings to configure Impala
###########################################################################
[impala]
# Host of the Impala Server (one of the Impalad)
## server_host=localhost
# Port of the Impala Server
## server_port=21050
# Kerberos principal
## impala_principal=impala/hostname.foo.com
# Turn on/off impersonation mechanism when talking to Impala
## impersonation_enabled=False
###########################################################################
# Settings to configure Hbase
###########################################################################
[hbase]
# Comma-separated list of HBase Thrift servers for
# clusters in the format of '(name|host:port)'.
## hbase_clusters=(Cluster|localhost:9090)
# Hard limit of rows or columns per row fetched before truncating.
## truncate_limit = 500
###########################################################################
# Settings to configure Solr Search
###########################################################################
[search]
# URL of the Solr Server
## solr_url=http://localhost:8983/solr/
# Requires FQDN in solr_url if enabled
## security_enabled=false
## Query sent when no term is entered
## empty_query=*:*
###########################################################################
# Settings to configure Job Designer
###########################################################################
[jobsub]
# Location on local FS where examples and template are stored.
## local_data_dir=..../data
# Location on local FS where sample data is stored
## sample_data_dir=...thirdparty/sample_data
###########################################################################
# Settings to configure Job Browser.
###########################################################################
[jobbrowser]
# Share submitted jobs information with all users. If set to false,
# submitted jobs are visible only to the owner and administrators.
## share_jobs=true
###########################################################################
# Settings to configure the Zookeeper application.
###########################################################################
[zookeeper]
[[clusters]]
[[[default]]]
# Zookeeper ensemble. Comma separated list of Host/Port.
# e.g. localhost:2181,localhost:2182,localhost:2183
## host_ports=localhost:2181
# The URL of the REST contrib service (required for znode browsing)
## rest_url=http://localhost:9998
###########################################################################
# Settings to configure the Spark application.
###########################################################################
[spark]
# Address of the Spark master, e.g spark://localhost:7077. If empty use the current configuration.
# Can be overridden in the script too.
## spark_master=
# Local path to Spark Home on all the nodes of the cluster.
## spark_home=/usr/lib/spark
###########################################################################
# Settings for the User Admin application
###########################################################################
[useradmin]
# The name of the default user group that users will be a member of
default_user_group=supergroup
###########################################################################
# Settings for the RDBMS application
###########################################################################
[rdbms]
# The RDBMS app can have any number of databases configured in the databases
# section. A database is known by its section name
# (IE sqlite, mysql, psql, and oracle in the list below).
[[databases]]
# sqlite configuration.
## [[[sqlite]]]
# Name to show in the UI.
## nice_name=SQLite
# For SQLite, name defines the path to the database.
## name=/tmp/sqlite.db
# Database backend to use.
## engine=sqlite
# mysql, oracle, or postgresql configuration.
## [[[mysql]]]
# Name to show in the UI.
## nice_name="My SQL DB"
# For MySQL and PostgreSQL, name is the name of the database.
# For Oracle, Name is instance of the Oracle server. For express edition
# this is 'xe' by default.
## name=mysqldb
# Database backend to use. This can be:
# 1. mysql
# 2. postgresql
# 3. oracle
## engine=mysql
# IP or hostname of the database to connect to.
## host=localhost
# Port the database server is listening to. Defaults are:
# 1. MySQL: 3306
# 2. PostgreSQL: 5432
# 3. Oracle Express Edition: 1521
## port=3306
# Username to authenticate with when connecting to the database.
## user=example
# Password matching the username to authenticate with when
# connecting to the database.
## password=example
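
After editing, it helps to confirm that the handful of values changed above are what Hue will actually read. A minimal check (the key list below only covers the settings touched in this walkthrough):

# Show the non-commented settings we changed in hue.ini
grep -E '(http_host|http_port|server_user|fs_defaultfs|webhdfs_url|resourcemanager_host|hive_server_host|hive_conf_dir|oozie_url|default_user_group)=' /opt/hue/desktop/conf/hue.ini | grep -v '^[[:space:]]*#'

Once the server is running, the merged configuration can also be inspected at http://master:8888/dump_config, as the file header notes.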

Modify the Hadoop configuration files

  • hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
  • core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/scott/hadoop-${user.name}</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hue.groups</name>
    <value>*</value>
  </property>
</configuration>
  • httpfs-site.xml
<configuration>
  <property>
    <name>httpfs.proxyuser.hue.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>httpfs.proxyuser.hue.groups</name>
    <value>*</value>
  </property>
</configuration>
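
These Hadoop changes only take effect after HDFS (and HttpFs, if you use it) is restarted. A hedged sketch of restarting and probing WebHDFS, assuming HADOOP_HOME points at the Hadoop 2.x installation (adjust to your layout) and the NameNode serves WebHDFS on master:50070 as configured in hue.ini above:

# Restart HDFS so dfs.webhdfs.enabled and the hue proxyuser settings are picked up
# (HADOOP_HOME is assumed to point at your Hadoop install)
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh
# Probe WebHDFS; a JSON FileStatuses listing means Hue's webhdfs_url will work
curl -s "http://master:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue"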

Set environment variables

export HUE_HOME=/opt/hue
export PATH=$HUE_HOME/build/env/bin:$PATH
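
To keep these across shell sessions, they can be appended to the shell profile (assuming ~/.bashrc here):

echo 'export HUE_HOME=/opt/hue' >> ~/.bashrc
echo 'export PATH=$HUE_HOME/build/env/bin:$PATH' >> ~/.bashrc
source ~/.bashrc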

Start

supervisor
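
With the PATH set above, this resolves to /opt/hue/build/env/bin/supervisor, which starts the Hue web server. From a second terminal you can confirm it is listening on the host and port configured in hue.ini:

# Expect an HTTP response (200 or a redirect to the login page)
curl -I http://master:8888/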

Startup log

[INFO] Not running as root, skipping privilege drop
starting server with options {'ssl_certificate': None, 'workdir': None, 'server_name': 'localhost', 'host': 'master', 'daemonize': False, 'threads': 10, 'pidfile': None, 'ssl_private_key': None, 'server_group': 'hue', 'ssl_cipher_list': 'DEFAULT:!aNULL:!eNULL:!LOW:!EXPORT:!SSLv2', 'port': 8888, 'server_user': 'hue'}

Note the line "Not running as root, skipping privilege drop": the server was started as the current non-root user, so the drop to the configured server_user/server_group (hue) is skipped; the startup itself succeeded, listening on master:8888.

http://master:8888/

On the first visit Hue asks you to register a user, and this first account becomes the Hue administrator; here we register the user hue with password hue.