Changes from 6 commits
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
@@ -155,7 +155,7 @@ $(IMAGE_DIRS): %: %/Dockerfile | check-links
export TESTED_IMAGE=$* && \
cd test && \
docker-compose up -t 0 -d hadoop-master && \
- time docker-compose run -e EXPECTED_CAPABILITIES="`cat ../$*/capabilities.txt | tr '\n' ' '`" --rm test-runner
+ time docker-compose run -e EXPECTED_CAPABILITIES="`cat ../$*/capabilities.txt | tr '\n' ' '`" -e IMAGE=$* --rm test-runner

#
# Static pattern rule to pull docker images that are external dependencies of
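For readers unfamiliar with the backtick expression in the changed Makefile line, the following is a minimal shell sketch of the value it produces. The sample file contents (`hive`, `kerberos`) and the `flatten_capabilities` name are made up for illustration; the real `capabilities.txt` files are not part of this diff.

```shell
#!/bin/sh
# Join the lines of a capabilities file into one space-separated string,
# as the Makefile does with `cat ... | tr '\n' ' '` (here without the extra cat).
flatten_capabilities() {
    tr '\n' ' ' < "$1"
}

# Hypothetical sample file; the real capabilities.txt contents may differ.
sample=$(mktemp)
printf 'hive\nkerberos\n' > "$sample"

EXPECTED_CAPABILITIES="$(flatten_capabilities "$sample")"
echo "EXPECTED_CAPABILITIES=[$EXPECTED_CAPABILITIES]"
rm -f "$sample"
```

Note that every newline becomes a space, so the value ends with a trailing space; a consumer that splits on whitespace is unaffected by that.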
70 changes: 70 additions & 0 deletions teradatalabs/mapr52-base/Dockerfile
@@ -0,0 +1,70 @@
# Copyright 2017 Teradata
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM teradatalabs/centos6-java8-oracle
MAINTAINER Teradata Docker Team <docker@teradata.com>

# ADD REPO FOR MAPR
ADD files/maprtech.repo /etc/yum.repos.d/maprtech.repo
COPY files/id_rsa.pub /root/
RUN yum update -y \
# ... GET MapRGPG KEY
Contributor: missing space

&& rpm --import http://package.mapr.com/releases/pub/maprgpg.key \

# INSTALL UTILITY SOFTWARE
&& yum install -y iputils vim openssh-server openssh-clients sudo lsof \
Contributor: that's not needed, until proven otherwise

Author: apart from vim, all the others are required

# CONFIGURE SSH
&& chkconfig sshd on \
&& grep -rl '#Port 22' /etc/ssh/sshd_config | xargs sed -i 's/#Port 22/Port 22/g' \
Contributor: Why is that needed? The other containers expose sshd as well, and as far as I recall none of them sed the port 22 line in the config anywhere.

Author: It's not working without that. The other containers will all have the same problem when someone SSHes into hadoop-master from outside.
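To make the effect of the `sed` step under discussion concrete, here is a minimal sketch that applies the same substitution directly, without the `grep -rl ... | xargs` wrapper. It works on a throwaway file rather than the real `/etc/ssh/sshd_config`; the sample contents and the `enable_port_22` name are assumptions, and `sed -i` assumes GNU sed as found on CentOS.

```shell
#!/bin/sh
# Uncomment the default "Port 22" line in an sshd_config-style file.
enable_port_22() {
    sed -i 's/^#Port 22$/Port 22/' "$1"
}

# Hypothetical stand-in for /etc/ssh/sshd_config.
config=$(mktemp)
printf '#Port 22\n#AddressFamily any\n' > "$config"

enable_port_22 "$config"
cat "$config"
```

OpenSSH uses port 22 when no `Port` line is active, so this sketch only shows what the step changes in the file; it does not settle whether the step is needed, which is the open question in the thread above.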

&& service sshd start \
Contributor: Why would that be needed? This runs during the image build; a newly started container will have sshd down anyway (except for cases when supervisord spins it up).


# INSTALL MAPR
&& yum install -y mapr-fileserver mapr-nfs mapr-nodemanager mapr-cldb \
Contributor: squash into a single yum install invocation
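A minimal sketch of what the squashed invocation could look like, written as a shell function so that it can be dry-run outside the image build. The `install_mapr_packages` name and the command-prefix trick are illustrative assumptions; the package list is copied from the three original `yum install` lines.

```shell
#!/bin/sh
# Install every MapR package from the three original lines in one yum call.
# Any arguments are used as a command prefix, so `install_mapr_packages echo`
# prints the command instead of running it.
install_mapr_packages() {
    "$@" yum install -y \
        mapr-fileserver mapr-nfs mapr-nodemanager mapr-cldb \
        mapr-zookeeper mapr-resourcemanager mapr-historyserver \
        mapr-webserver mapr-gateway mapr-httpfs
}

install_mapr_packages echo   # dry run: print the command, do not install
```

In the Dockerfile itself the same effect is achieved by listing all ten packages after a single `&& yum install -y`.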

&& yum install -y mapr-zookeeper mapr-resourcemanager mapr-historyserver \
&& yum install -y mapr-webserver mapr-gateway mapr-httpfs \

# ADD USERS AND CHANGE OWNERSHIPS
&& adduser mapr \
Contributor: extract a function `setup_user` that adds and configures a single user, then call it 3 times, once per user

Author: @ArturGajowy do you want the function to be created inside the Dockerfile, or in a separate shell script that is called from the Dockerfile?
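One possible shape for the `setup_user` helper requested above, sketched as a standalone shell script (one of the two options raised in the reply). The `run` wrapper and the `DRY_RUN` variable are illustrative additions so the sketch can be exercised without root. The original `touch` line is omitted on the assumption that `adduser` already creates the home directory, and the recursive `chown` of `/opt/mapr/httpfs` for `mapr` would remain a separate step.

```shell
#!/bin/sh
# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_user NAME: create the account, make its shells start in its home
# directory, and give it ownership of that directory.
setup_user() {
    user="$1"
    home="/home/$user"
    run adduser "$user"
    run sh -c "echo 'cd $home' >> $home/.bashrc"
    run chown "$user:$user" "$home"
}

# Dry run for the three users named in the Dockerfile.
DRY_RUN=1
for u in mapr hive hdfs; do
    setup_user "$u"
done
```

With `DRY_RUN` unset, the same three calls would perform the real setup, which keeps the Dockerfile down to one line per user.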

&& adduser hive \
&& adduser hdfs \
&& touch /home/mapr /home/hive /home/hdfs \
&& echo "cd /home/mapr" >> /home/mapr/.bashrc \
&& echo "cd /home/hive" >> /home/hive/.bashrc \
&& echo "cd /home/hdfs" >> /home/hdfs/.bashrc \
&& chown -R mapr:mapr /home/mapr /opt/mapr/httpfs \
&& chown hive:hive /home/hive \
&& chown hdfs:hdfs /home/hdfs \
# CONFIGURE ZOOKEEPER'S DATA DIRECTORY
&& rm -rf /opt/mapr/zkdata \
&& mkdir /opt/mapr/zkdata \
&& chmod 777 /opt/mapr/zkdata \
&& mkdir -p /mapr \

# INSTALL PYTHON AND SUPERVISORD
&& yum install -y python-setuptools \
&& easy_install pip \
&& pip install supervisor \
&& mkdir /etc/supervisord.d/ \
# ... AND ITS MISSING DEPENDENCY
&& wget http://dl.fedoraproject.org/pub/epel/6/x86_64/python-meld3-0.6.7-1.el6.x86_64.rpm \
&& rpm -ihv python-meld3-0.6.7-1.el6.x86_64.rpm \
&& rm python-meld3-0.6.7-1.el6.x86_64.rpm \

# CLEANUP
&& yum -y clean all && rm -rf /tmp/* /var/tmp/* \

# GENERATE SSH KEYS
&& ssh-keygen -t rsa -b 4096 -C "automation@teradata.com" -N "" -f /root/.ssh/id_rsa \
&& cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys \
&& cat /root/id_rsa.pub | cat >> ~/.ssh/authorized_keys
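The last three lines of the Dockerfile above assemble `authorized_keys` from the freshly generated key plus the key copied in from `files/id_rsa.pub`. The following is a minimal sketch of that assembly against a throwaway directory. The `build_authorized_keys` name and the placeholder key strings are hypothetical, and the `chmod 600` is an extra step that is not in the original.

```shell
#!/bin/sh
# build_authorized_keys SSH_DIR EXTRA_KEY
# Start authorized_keys from the generated key in SSH_DIR, then append EXTRA_KEY.
build_authorized_keys() {
    ssh_dir="$1"
    extra_key="$2"
    cp "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"
    cat "$extra_key" >> "$ssh_dir/authorized_keys"   # no second `cat` needed
    chmod 600 "$ssh_dir/authorized_keys"             # assumption: not in the original
}

# Throwaway stand-ins for /root/.ssh and /root/id_rsa.pub.
demo=$(mktemp -d)
mkdir "$demo/.ssh"
echo "ssh-rsa GENERATED root@image" > "$demo/.ssh/id_rsa.pub"
echo "ssh-rsa PROVIDED root@host" > "$demo/id_rsa.pub"

build_authorized_keys "$demo/.ssh" "$demo/id_rsa.pub"
cat "$demo/.ssh/authorized_keys"
```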
9 changes: 9 additions & 0 deletions teradatalabs/mapr52-base/README.md
@@ -0,0 +1,9 @@
# mapr52-base
Contributor: add badges


Docker image with all MapR related softwares installed and there dependencies.
Contributor: software (it's uncountable)

Contributor: their


## Oracle license

By using this image, you accept the Oracle Binary Code License Agreement for Java SE available here:
[http://www.oracle.com/technetwork/java/javase/terms/license/index.html](http://www.oracle.com/technetwork/java/javase/terms/license/index.html)
1 change: 1 addition & 0 deletions teradatalabs/mapr52-base/files/id_rsa.pub
@@ -0,0 +1 @@
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA1PL4EwRZFy1ewBTa4a1TK+mQ4rAupOeZsiqir/su61dAGvC6pEFAa+Litj6ub6NvcBRMAdXeBtbOnQpInE7BFwKVhwU3n60Mc69SjLiozK3Oxh9sfmbJv/JdELRS5aB9x82Y0bO5fZFPFj7SxPNMugQQMEMQHW01wsa5nJR2pYLwCtu7yoD6fQ0TJEsRqWwyQTNoR19yzL6h7p/hq9SqiqCKfsHWK4+Tj0IgF7Nwz8i+BqqOq2kC9lTRuT8HalNbqVVQ6iI+ER7FgdfSZtKKX6R9SOaKQ7p0Dt6JLFibMNhjwt5EKHsgfMOsl1G8SEncDREtTng8/JLlvIhiqmWzwQ== root@d57cdb1934d1
13 changes: 13 additions & 0 deletions teradatalabs/mapr52-base/files/maprtech.repo
@@ -0,0 +1,13 @@
[maprtech]
name=MapR Technologies
baseurl=http://package.mapr.com/releases/v5.2.0/redhat/
enabled=1
gpgcheck=0
protect=1

[maprecosystem]
name=MapR Technologies
baseurl=http://package.mapr.com/releases/MEP/MEP-1.0/redhat
enabled=1
gpgcheck=0
protect=1
87 changes: 87 additions & 0 deletions teradatalabs/mapr52-hive-kerberized/Dockerfile
@@ -0,0 +1,87 @@
# Copyright 2017 Teradata
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

FROM teradatalabs/mapr52-hive
MAINTAINER Teradata Docker Team <docker@teradata.com>

# REMOVE UNNECESSARY FILES
RUN rm -rf /opt/mapr/conf/ssl_truststore \
&& rm -rf /opt/mapr/conf/maprserverticket \
&& rm -rf /opt/mapr/conf/cldb.key \
&& rm -rf /opt/mapr/conf/ssl_keystore \
&& rm -rf /root/bootstrap.sh \

# INSTALL KERBEROS
&& yum install -y krb5-libs krb5-server krb5-workstation

# ADD KERBEROS CONFIGURATION
ADD files/bootstrap.sh /root/
ADD files/kerberos/krb5.conf /etc/krb5.conf
ADD files/kerberos/kdc.conf /var/kerberos/krb5kdc/kdc.conf
ADD files/kerberos/kadm5.acl /var/kerberos/krb5kdc/kadm5.acl
ADD files/jceJars/local_policy.jar /usr/java/jdk1.8.0_102/jre/lib/security/local_policy.jar
ADD files/jceJars/US_export_policy.jar /usr/java/jdk1.8.0_102/jre/lib/security/US_export_policy.jar

# ENABLE HIVE SECURITY
ADD files/conf/hive-site.xml /opt/mapr/hive/hive-1.2/conf/hive-site.xml

# CREATE KERBEROS DATABASE
RUN /usr/sbin/kdb5_util create -s -P password \
&& usermod -g root hdfs \
&& usermod -g mapr hdfs \
# CREATE MAPR AND HIVE PRINCIPALS AND KEYTABS
&& /usr/sbin/kadmin.local -q "addprinc -randkey mapr/mycluster@LABS.TERADATA.COM" \
&& /usr/sbin/kadmin.local -q "xst -norandkey -k /opt/mapr/conf/mapr.keytab mapr/mycluster@LABS.TERADATA.COM" \
&& /usr/sbin/kadmin.local -q "addprinc -randkey hive/mycluster@LABS.TERADATA.COM" \
&& /usr/sbin/kadmin.local -q "xst -norandkey -k /opt/mapr/conf/hive.keytab hive/mycluster@LABS.TERADATA.COM" \

# CREATE HDFS USER
&& /usr/sbin/kadmin.local -q "addprinc -randkey hdfs/mycluster@LABS.TERADATA.COM" \
&& /usr/sbin/kadmin.local -q "xst -norandkey -k /opt/mapr/conf/hdfs.keytab hdfs/mycluster@LABS.TERADATA.COM" \

# CHANGE THE PERMISSIONS AND OWNERSHIPS FOR KEYTABS
&& chmod 644 /opt/mapr/conf/hive.keytab /opt/mapr/conf/mapr.keytab /opt/mapr/conf/hdfs.keytab \
&& chmod 777 /root/bootstrap.sh \
&& chown mapr:mapr /opt/mapr/conf/mapr.keytab \
&& chown hive:hive /opt/mapr/conf/hive.keytab \
&& chown hdfs:hdfs /opt/mapr/conf/hdfs.keytab \

# CREATE PRESTO PRINCIPAL AND KEYTAB
&& /usr/sbin/kadmin.local -q "addprinc -randkey presto-server/presto-master.docker.cluster@LABS.TERADATA.COM" \
&& /usr/sbin/kadmin.local -q "addprinc -randkey presto-client/presto-master.docker.cluster@LABS.TERADATA.COM" \
&& /usr/sbin/kadmin.local -q "addprinc -randkey hive/presto-master.docker.cluster@LABS.TERADATA.COM" \
&& mkdir -p /etc/presto/conf \
&& /usr/sbin/kadmin.local -q "xst -norandkey -k /etc/presto/conf/presto-server.keytab presto-server/presto-master.docker.cluster" \
&& /usr/sbin/kadmin.local -q "xst -norandkey -k /etc/presto/conf/presto-client.keytab presto-client/presto-master.docker.cluster" \
&& /usr/sbin/kadmin.local -q "xst -norandkey -k /etc/presto/conf/hive-presto-master.keytab hive/presto-master.docker.cluster" \
&& chmod 644 /etc/presto/conf/*.keytab \
&& cat /opt/mapr/conf/env.sh | sed -e '0,/MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=maprsasl_keytab"/ s/MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=maprsasl_keytab"/MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=hybrid"/' > env_new.sh \
&& cat env_new.sh | sed -e '0,/MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"/ s/MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"/MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=hybrid"/' > /opt/mapr/conf/env.sh \
&& rm -rf env_new.sh

# CREATE SSL KEYSTORE
RUN keytool -genkeypair \
-alias presto \
-keyalg RSA \
-keystore /etc/presto/conf/keystore.jks \
-keypass password \
-storepass password \
-dname "CN=presto-master, OU=, O=, L=, S=, C="
RUN chmod 644 /etc/presto/conf/keystore.jks

# EXPOSE KERBEROS PORTS
EXPOSE 88
EXPOSE 749

CMD /root/startup.sh
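The Dockerfile above rewrites `/opt/mapr/conf/env.sh` in two `cat | sed` passes through a temporary `env_new.sh`. As a sketch under assumptions, the same two first-occurrence substitutions can be expressed as one in-place `sed` call. The `switch_hive_login_to_hybrid` name and the sample file are hypothetical (the real `env.sh` is not shown in this diff), and both `-i` and the `0,/regex/` address are GNU sed features.

```shell
#!/bin/sh
# Replace the first occurrence of each Hive login option with "hybrid".
switch_hive_login_to_hybrid() {
    sed -i \
        -e '0,/MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=maprsasl_keytab"/ s/MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=maprsasl_keytab"/MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=hybrid"/' \
        -e '0,/MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"/ s/MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"/MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=hybrid"/' \
        "$1"
}

# Hypothetical stand-in for /opt/mapr/conf/env.sh; the third line repeats the
# first to show that only the first occurrence is rewritten.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
export MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"
export MAPR_HIVE_SERVER_LOGIN_OPTS="-Dhadoop.login=maprsasl_keytab"
export MAPR_HIVE_LOGIN_OPTS="-Dhadoop.login=maprsasl"
EOF

switch_hive_login_to_hybrid "$env_file"
cat "$env_file"
```

Editing in place removes the need for the intermediate file and the final `rm -rf env_new.sh`.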
17 changes: 17 additions & 0 deletions teradatalabs/mapr52-hive-kerberized/README.md
@@ -0,0 +1,17 @@
# mapr52-hive-kerberized

Docker image with Kerberos enabled for MapR. Please note that the running services have a lower memory heap size set.
For more details, please check the [hadoop-env.sh](files/conf/hadoop-env.sh) configuration file.
If you want to work on larger datasets, please tune those settings accordingly; the current settings should be optimal
for general correctness testing.
## Run

```
$ docker run --privileged -d --name hadoop-master -h hadoop-master teradatalabs/mapr52-hive-kerberized
```

## Oracle license

By using this image, you accept the Oracle Binary Code License Agreement for Java SE available here:
[http://www.oracle.com/technetwork/java/javase/terms/license/index.html](http://www.oracle.com/technetwork/java/javase/terms/license/index.html)
49 changes: 49 additions & 0 deletions teradatalabs/mapr52-hive-kerberized/files/bootstrap.sh
@@ -0,0 +1,49 @@
#!/bin/sh


# START SSHD AND THE SOCKS PROXY FOR THE HIVE METASTORE
supervisorctl start sshd
supervisorctl start socks-proxy

# CONFIGURE MAPR
/opt/mapr/server/configure.sh -N mycluster -Z localhost -C localhost -HS localhost -no-autostart

# SETUP DISK FOR MAPR BY RUNNING disksetup
/opt/mapr/server/disksetup -M -F /root/disk.txt

# CREATE HIVE PROXY USERS
chmod 755 /opt/mapr/conf/proxy

# CONFIGURE HIVE
/opt/mapr/server/configure.sh -R

# ENABLE SECURITY IN MAPR
/opt/mapr/server/configure.sh -secure -genkeys -C localhost -Z localhost -N mycluster -no-autostart

# START KERBEROS SERVICES
/sbin/service krb5kdc start
/sbin/service kadmin start

# START MAPR SERVICES
service mapr-zookeeper start
service mapr-warden start

# WAIT FOR WARDEN TO START ALL THE SERVICES
sh /root/wardenTracker.sh

# START HTTPFS SERVICES
maprcli node services -name httpfs -action start -nodes $(hostname)

# CREATE KERBEROS TICKET
kinit -kt /opt/mapr/conf/mapr.keytab mapr/mycluster@LABS.TERADATA.COM

# CREATE MAPR TICKET
maprlogin kerberos -user mapr/mycluster@LABS.TERADATA.COM

# RUN HDFS COMMANDS
hadoop fs -mkdir /user/root /user/hive /user/hdfs /user/hive/warehouse /var /var/mapr /var/mapr/cluster /var/mapr/cluster/yarn /var/mapr/cluster/yarn/rm /var/mapr/cluster/yarn/rm/staging /var/mapr/cluster/yarn/rm/staging/hive
hadoop fs -chmod 777 /user/hive /user/hdfs /user/hive/warehouse /var/mapr /var/mapr/cluster/yarn/rm/staging/hive

# REMOVE MAPR TICKET AND KERBEROS TICKET
kdestroy
rm -rf /tmp/*
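The `hadoop fs -mkdir` call in the script above lists every intermediate directory explicitly. A shorter variant, sketched here, passes only the leaf paths and relies on `-mkdir -p` to create the parents. This assumes the Hadoop shell bundled with this MapR release supports `-p`, as the Hadoop 2.x FileSystem shell does; the `make_cluster_dirs` name and the command-prefix trick are illustrative.

```shell
#!/bin/sh
# Create the directory tree with one `-mkdir -p` call on the leaf paths.
# Any arguments act as a command prefix, so `make_cluster_dirs echo` only prints.
make_cluster_dirs() {
    "$@" hadoop fs -mkdir -p \
        /user/root /user/hdfs /user/hive/warehouse \
        /var/mapr/cluster/yarn/rm/staging/hive
}

make_cluster_dirs echo   # dry run: print the command, do not contact the cluster
```

The subsequent `hadoop fs -chmod 777` line is unaffected, since it names its targets explicitly.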
94 changes: 94 additions & 0 deletions teradatalabs/mapr52-hive-kerberized/files/conf/hive-site.xml
@@ -0,0 +1,94 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<configuration>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>creates necessary schema on a startup if one doesn't exist. set
this to false, after creating it once</description>
</property>

<property>
<name>hive.server2.enable.doAs</name>
<value>true</value>
<description>Set this property to enable impersonation in Hive Server 2</description>
</property>

<property>
<name>hive.metastore.execute.setugi</name>
<value>true</value>
<description>Set this property to enable Hive Metastore service impersonation in unsecure mode. In unsecure mode, setting this property to true will cause the metastore to execute DFS operations using the client's reported user and group permissions. Note that this property must be set on both the client and server sides. If the client sets it to true and the server sets it to false, the client setting will be ignored.</description>
</property>

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>

<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
<description>password to use against metastore database</description>
</property>

<property>
<name>hive.metastore.uris</name>
<value>thrift://localhost:9083</value>
</property>

<!-- Configuration for Kerberos -->

<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/opt/mapr/conf/hive.keytab</value>
<description>The path to the Kerberos Keytab file containing the metastore thrift server's service principal.</description>
</property>
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/mycluster@LABS.TERADATA.COM</value>
<description>The service principal for the metastore thrift server. The special string _HOST will be replaced automatically with the correct hostname.</description>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
<description>authenticationtype</description>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/mycluster@LABS.TERADATA.COM</value>
<description>HiveServer2 principal. If _HOST is used as the FQDN portion, it will be replaced with the actual hostname of the running instance.</description>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/opt/mapr/conf/hive.keytab</value>
<description>Keytab file for HiveServer2 principal</description>
</property>
</configuration>
Binary file not shown.
Binary file not shown.
1 change: 1 addition & 0 deletions teradatalabs/mapr52-hive-kerberized/files/kerberos/kadm5.acl
@@ -0,0 +1 @@
*/admin@LABS.TERADATA.COM *
12 changes: 12 additions & 0 deletions teradatalabs/mapr52-hive-kerberized/files/kerberos/kdc.conf
@@ -0,0 +1,12 @@
[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88

[realms]
LABS.TERADATA.COM = {
#master_key_type = aes256-cts
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /usr/share/dict/words
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
}
18 changes: 18 additions & 0 deletions teradatalabs/mapr52-hive-kerberized/files/kerberos/krb5.conf
@@ -0,0 +1,18 @@
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log

[libdefaults]
default_realm = LABS.TERADATA.COM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true

[realms]
LABS.TERADATA.COM = {
kdc = hadoop-master
admin_server = hadoop-master
}