Background
- To let the business big-data architecture use multiple SQL engines (Spark, Flink, Trino, querying Hive, ClickHouse, etc. at the same time), we need to deploy a unified SQL entry point that works across multiple engines and platforms.
- This write-up is an initial step toward that goal (the remaining parts are still in progress). Since I could not find an existing write-up of Kyuubi on K8s, I am documenting the process here.
Base environment (non-k8s components only)
Component | Version
---|---
kyuubi | v1.6.0
spark | v3.3.0
CDH | v6.2.1
1. Build the Docker images
Build the Spark 3.3.0 image
- Modify Spark's configuration files
  - Modify spark-env.sh: add the following lines (the paths are the future in-container paths):
    export HADOOP_CONF_DIR=/opt/spark/conf:/opt/spark/conf
    export YARN_CONF_DIR=/opt/spark/conf:/opt/spark/conf
  - Leave spark-defaults.conf unchanged
- Copy the Hadoop configuration from CDH into /data/spark-3.3.0/conf (CDH keeps its Hadoop configuration under etc), for example as sketched below
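A rough sketch of that copy, assuming the usual CDH client-config locations (the source paths are assumptions, adjust them to your cluster):
# hypothetical CDH client-config paths; CDH normally materializes them under /etc/hadoop/conf and /etc/hive/conf
cp /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml /etc/hadoop/conf/yarn-site.xml /data/spark-3.3.0/conf/
cp /etc/hive/conf/hive-site.xml /data/spark-3.3.0/conf/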
- Write the init script run.sh (the content below has to be added to it; *** marks site-specific values)
# make the CDH cluster IPs resolvable later on
echo " ***.***.***.*** " >> /etc/hosts
# config file required for Kerberos authentication
echo "***" > /etc/krb5.conf
# authenticate inside the image
kinit -kt /opt/spark/work-dir/hive.keytab hive/***@****.****.****
Put the finished run.sh under /data/spark-3.3.0/, and put the keytab file in the same directory (any location works, but this one is the most convenient, since the Dockerfile below refers to it).
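Before building, it is worth a quick check that everything the Dockerfile below references is actually in place; a minimal sketch (file names follow this write-up):
# run from /data/spark-3.3.0, which is also the docker build context used later
ls run.sh hive.keytab conf kubernetes/dockerfiles/spark/entrypoint.sh kubernetes/dockerfiles/spark/Dockerfile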
- Modify the script executed when the container starts, /data/spark-3.3.0/kubernetes/dockerfiles/spark/entrypoint.sh (the full modified script follows)
Key changes:
- Run the init script run.sh when the driver and the executor start (777 permissions are used purely for convenience)
- myuid is hard-coded to 0 so the entrypoint runs as root and can edit /etc/hosts
#!/bin/bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# echo commands to the terminal output
set -ex

# Check whether there is a passwd entry for the container UID
#myuid=$(id -u)
myuid=0
mygid=$(id -g)
# turn off -e for getent because it will return error code in anonymous uid case
set +e
uidentry=$(getent passwd $myuid)
set -e

# If there is no passwd entry for the container UID, attempt to create one
if [ -z "$uidentry" ] ; then
    if [ -w /etc/passwd ] ; then
        echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
    else
        echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
    fi
fi

if [ -z "$JAVA_HOME" ]; then
  JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
fi

SPARK_CLASSPATH="$SPARK_CLASSPATH:${SPARK_HOME}/jars/*"
env | grep SPARK_JAVA_OPT_ | sort -t_ -k4 -n | sed 's/[^=]*=\(.*\)/\1/g' > /tmp/java_opts.txt
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt

if [ -n "$SPARK_EXTRA_CLASSPATH" ]; then
  SPARK_CLASSPATH="$SPARK_CLASSPATH:$SPARK_EXTRA_CLASSPATH"
fi

if ! [ -z ${PYSPARK_PYTHON+x} ]; then
    export PYSPARK_PYTHON
fi
if ! [ -z ${PYSPARK_DRIVER_PYTHON+x} ]; then
    export PYSPARK_DRIVER_PYTHON
fi

# If HADOOP_HOME is set and SPARK_DIST_CLASSPATH is not set, set it here so Hadoop jars are available to the executor.
# It does not set SPARK_DIST_CLASSPATH if already set, to avoid overriding customizations of this value from elsewhere e.g. Docker/K8s.
if [ -n "${HADOOP_HOME}" ] && [ -z "${SPARK_DIST_CLASSPATH}" ]; then
  export SPARK_DIST_CLASSPATH="$($HADOOP_HOME/bin/hadoop classpath)"
fi

if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
  SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
fi

if ! [ -z ${SPARK_CONF_DIR+x} ]; then
  SPARK_CLASSPATH="$SPARK_CONF_DIR:$SPARK_CLASSPATH";
elif ! [ -z ${SPARK_HOME+x} ]; then
  SPARK_CLASSPATH="$SPARK_HOME/conf:$SPARK_CLASSPATH";
fi

case "$1" in
  driver)
    shift 1
    chmod 777 /opt/spark/work-dir/run.sh
    /bin/bash /opt/spark/work-dir/run.sh
    cat /etc/hosts
    CMD=(
      "$SPARK_HOME/bin/spark-submit"
      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
      --deploy-mode client
      "$@"
    )
    ;;
  executor)
    shift 1
    chmod 777 /opt/spark/work-dir/run.sh
    /bin/bash /opt/spark/work-dir/run.sh
    cat /etc/hosts
    CMD=(
      ${JAVA_HOME}/bin/java
      "${SPARK_EXECUTOR_JAVA_OPTS[@]}"
      -Xms$SPARK_EXECUTOR_MEMORY
      -Xmx$SPARK_EXECUTOR_MEMORY
      -cp "$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH"
      org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend
      --driver-url $SPARK_DRIVER_URL
      --executor-id $SPARK_EXECUTOR_ID
      --cores $SPARK_EXECUTOR_CORES
      --app-id $SPARK_APPLICATION_ID
      --hostname $SPARK_EXECUTOR_POD_IP
      --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
      --podName $SPARK_EXECUTOR_POD_NAME
    )
    ;;

  *)
    echo "Non-spark-on-k8s command provided, proceeding in pass-through mode..."
    CMD=("$@")
    ;;
esac

# Execute the container CMD under tini for better hygiene
exec /usr/bin/tini -s -- "${CMD[@]}"
- Edit /data/spark-3.3.0/kubernetes/dockerfiles/spark/Dockerfile
Key points:
- Change the openjdk base image source (optional, but with a poor network the image may fail to pull)
- Change the Debian apt sources (same reason as above)
- Install vim, sudo, net-tools, lsof, bash, tini, libc6, libpam-modules, krb5-user, libpam-krb5, libpam-ccreds, libkrb5-dev, libnss3, procps and similar packages (to make later work inside the container easier)
- Copy the files under conf into /opt/spark/conf
- Copy the keytab file into /opt/spark/work-dir
- Copy the init script run.sh, which patches /etc/hosts after the container starts
- Set spark_uid to 0 (root), which is required in order to modify the hosts file
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

ARG java_image_tag=8-jre-slim

FROM ***.***.***.***/bigdata/openjdk:${java_image_tag}

#ARG spark_uid=185
ARG spark_uid=0

# Before building the docker image, first build and make a Spark distribution following
# the instructions in https://spark.apache.org/docs/latest/building-spark.html.
# If this docker file is being used in the context of building your images from a Spark
# distribution, the docker build command should be invoked from the top level directory
# of the Spark distribution. E.g.:
# docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .

RUN set -ex && \
    sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
    sed -i 's/http:\/\/security.\(.*\)/https:\/\/security.\1/g' /etc/apt/sources.list && \
    sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    apt-get update && \
    ln -s /lib /lib64 && \
    apt-get install -y vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps && \
    mkdir -p /opt/spark && \
    mkdir -p /opt/spark/examples && \
    mkdir -p /opt/spark/work-dir && \
    mkdir -p /opt/hadoop && \
    touch /opt/spark/RELEASE && \
    rm /bin/sh && \
    ln -sv /bin/bash /bin/sh && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
    rm -rf /var/cache/apt/*

COPY jars /opt/spark/jars
COPY bin /opt/spark/bin
COPY sbin /opt/spark/sbin
COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY kubernetes/dockerfiles/spark/decom.sh /opt/
COPY examples /opt/spark/examples
COPY kubernetes/tests /opt/spark/tests
#COPY hadoop/conf /opt/hadoop/conf
COPY conf /opt/spark/conf
COPY data /opt/spark/data
COPY hive.keytab /opt/spark/work-dir
COPY run.sh /opt/spark/work-dir

ENV SPARK_HOME /opt/spark

WORKDIR /opt/spark/work-dir
RUN chmod 777 /opt/spark/work-dir
RUN chmod a+x /opt/decom.sh
RUN chmod 777 /opt/spark/work-dir/run.sh

ENTRYPOINT [ "/opt/entrypoint.sh" ]

# Specify the User that the actual main process will run as
USER ${spark_uid}
- Go back to /data/spark-3.3.0 and run the following commands
# build the image
./bin/docker-image-tool.sh -t v3.3.0 build
# re-tag the image
docker tag spark:v3.3.0 ***.***.***.***/bigdata/spark:v3.3.0
# push the image to the internal registry (company-internal)
docker push ***.***.***.***/bigdata/spark:v3.3.0
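Optionally, a quick check that run.sh, the keytab and the Hadoop configs really ended up inside the image (a minimal sketch; the image name is the tag used above):
docker run --rm --entrypoint /bin/bash ***.***.***.***/bigdata/spark:v3.3.0 -c 'ls -l /opt/spark/work-dir && ls /opt/spark/conf'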
Build the Kyuubi 1.6.0 image
- Kyuubi's own configuration files do not need to be changed; the project provides a more convenient mechanism (kyuubi-configmap.yaml, see below)
- Write the init script run.sh (the script runs these commands, but they do not necessarily take effect; check inside the running container that kubectl can actually create pods, see the sketch after the script; *** marks site-specific values)
Key point:
- kubectl has to be downloaded separately; the exact steps are easy to find online
mkdir /etc/.kube
chmod 777 /etc/.kube
cp /opt/kyuubi/config /etc/.kube
# the crucial step that makes kubectl usable
echo "export KUBECONFIG=/etc/.kube/config" >> /etc/profile
export KUBECONFIG=/etc/.kube/config
source /etc/profile
# kubectl is mirrored on the intranet for convenient download
wget http://***.***.***.***/yum/k8s/kubectl
chmod +x ./kubectl
mv ./kubectl /usr/bin/
# verify that kubectl was installed successfully
kubectl version --client
echo "***" >> /etc/hosts
echo "***" > /etc/krb5.conf
kinit -kt /opt/kyuubi/hive.keytab hive/***@HADOOP.****.***
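Once the Kyuubi pod is running, a minimal sketch of that check (the pod name is a placeholder; the namespace comes from the YAML below):
# run from the k8s client node; the inner kubectl runs inside the Kyuubi container
kubectl exec -it <kyuubi-pod-name> -n ****-bd-k8s -- kubectl auth can-i create pods -n ****-bd-k8s
kubectl exec -it <kyuubi-pod-name> -n ****-bd-k8s -- kubectl get pods -n ****-bd-k8s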
- Modify /data/kyuubi-1.6.0/bin/kyuubi
Key change:
- In the `run` branch of the script, add (the full script follows):
chmod 777 /opt/kyuubi/run.sh
/bin/bash /opt/kyuubi/run.sh
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

## Kyuubi Server Main Entrance
CLASS="org.apache.kyuubi.server.KyuubiServer"

function usage() {
  echo "Usage: bin/kyuubi command"
  echo "  commands:"
  echo "    start        - Run a Kyuubi server as a daemon"
  echo "    restart      - Restart Kyuubi server as a daemon"
  echo "    run          - Run a Kyuubi server in the foreground"
  echo "    stop         - Stop the Kyuubi daemon"
  echo "    status       - Show status of the Kyuubi daemon"
  echo "    -h | --help  - Show this help message"
}

if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
  usage
  exit 0
fi

function kyuubi_logo() {
  source ${KYUUBI_HOME}/bin/kyuubi-logo
}

function kyuubi_rotate_log() {
  log=$1;

  if [[ -z ${KYUUBI_MAX_LOG_FILES} ]]; then
    num=5
  elif [[ ${KYUUBI_MAX_LOG_FILES} -gt 0 ]]; then
    num=${KYUUBI_MAX_LOG_FILES}
  else
    echo "Error: KYUUBI_MAX_LOG_FILES must be a positive number, but got ${KYUUBI_MAX_LOG_FILES}"
    exit -1
  fi

  if [ -f "$log" ]; then # rotate logs
    while [ ${num} -gt 1 ]; do
      prev=`expr ${num} - 1`
      [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
      num=${prev}
    done
    mv "$log" "$log.$num";
  fi
}

export KYUUBI_HOME="$(cd "$(dirname "$0")"/..; pwd)"

if [[ $1 == "start" ]] || [[ $1 == "run" ]]; then
  . "${KYUUBI_HOME}/bin/load-kyuubi-env.sh"
else
  . "${KYUUBI_HOME}/bin/load-kyuubi-env.sh" -s
fi

if [[ -z ${JAVA_HOME} ]]; then
  echo "Error: JAVA_HOME IS NOT SET! CANNOT PROCEED."
  exit 1
fi

RUNNER="${JAVA_HOME}/bin/java"

## Find the Kyuubi Jar
if [[ -z "$KYUUBI_JAR_DIR" ]]; then
  KYUUBI_JAR_DIR="$KYUUBI_HOME/jars"
  if [[ ! -d ${KYUUBI_JAR_DIR} ]]; then
    echo -e "\nCandidate Kyuubi lib $KYUUBI_JAR_DIR doesn't exist, searching development environment..."
    KYUUBI_JAR_DIR="$KYUUBI_HOME/kyuubi-assembly/target/scala-${KYUUBI_SCALA_VERSION}/jars"
  fi
fi

if [[ -z ${YARN_CONF_DIR} ]]; then
  KYUUBI_CLASSPATH="${KYUUBI_JAR_DIR}/*:${KYUUBI_CONF_DIR}:${HADOOP_CONF_DIR}"
else
  KYUUBI_CLASSPATH="${KYUUBI_JAR_DIR}/*:${KYUUBI_CONF_DIR}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"
fi

cmd="${RUNNER} ${KYUUBI_JAVA_OPTS} -cp ${KYUUBI_CLASSPATH} $CLASS"
pid="${KYUUBI_PID_DIR}/kyuubi-$USER-$CLASS.pid"

function start_kyuubi() {
  if [[ ! -w ${KYUUBI_PID_DIR} ]]; then
    echo "${USER} does not have 'w' permission to ${KYUUBI_PID_DIR}"
    exit 1
  fi

  if [[ ! -w ${KYUUBI_LOG_DIR} ]]; then
    echo "${USER} does not have 'w' permission to ${KYUUBI_LOG_DIR}"
    exit 1
  fi

  if [ -f "$pid" ]; then
    TARGET_ID="$(cat "$pid")"
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      echo "$CLASS running as process $TARGET_ID Stop it first."
      exit 1
    fi
  fi

  log="${KYUUBI_LOG_DIR}/kyuubi-$USER-$CLASS-$HOSTNAME.out"
  kyuubi_rotate_log ${log}
  echo "Starting $CLASS, logging to $log"
  nohup nice -n "${KYUUBI_NICENESS:-0}" ${cmd} >> ${log} 2>&1 < /dev/null &
  newpid="$!"
  echo "$newpid" > "$pid"

  # Poll for up to 5 seconds for the java process to start
  for i in {1..10}
  do
    if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
      break
    fi
    sleep 0.5
  done

  sleep 2
  # Check if the process has died; in that case we'll tail the log so the user can see
  if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
    echo "Failed to launch: ${cmd}"
    tail -2 "$log" | sed 's/^/ /'
    echo "Full log in $log"
  else
    echo "Welcome to"
    kyuubi_logo
  fi
}

function run_kyuubi() {
  echo "Starting $CLASS"
  nice -n "${KYUUBI_NICENESS:-0}" ${cmd}
}

function stop_kyuubi() {
  if [ -f ${pid} ]; then
    TARGET_ID="$(cat "$pid")"
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      echo "Stopping $CLASS"
      kill "$TARGET_ID" && rm -f "$pid"
      for i in {1..20}
      do
        sleep 0.5
        if [[ ! $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
          break
        fi
      done
      if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
        echo "Failed to stop kyuubi after 10 seconds, try 'kill -9 ${TARGET_ID}' forcefully"
      else
        kyuubi_logo
        echo "Bye!"
      fi
    else
      echo "no $CLASS to stop"
    fi
  else
    echo "no $CLASS to stop"
  fi
}

function check_kyuubi() {
  if [[ -f ${pid} ]]; then
    TARGET_ID="$(cat "$pid")"
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      echo "Kyuubi is running (pid: $TARGET_ID)"
    else
      echo "Kyuubi is not running"
    fi
  else
    echo "Kyuubi is not running"
  fi
}

case $1 in
  (start | "")
    start_kyuubi
    ;;

  (restart)
    echo "Restarting Kyuubi"
    stop_kyuubi
    start_kyuubi
    ;;

  (run)
    chmod 777 /opt/kyuubi/run.sh
    /bin/bash /opt/kyuubi/run.sh
    run_kyuubi
    ;;

  (stop)
    stop_kyuubi
    ;;

  (status)
    check_kyuubi
    ;;

  (*) usage ;;
esac
- Edit /data/kyuubi-1.6.0/docker/Dockerfile
Key points:
- Change the openjdk base image source
- Change the Debian apt sources
- Install wget, vim, sudo, net-tools, lsof, bash, tini, libc6, libpam-modules, krb5-user, libpam-krb5, libpam-ccreds, libkrb5-dev, libnss3, procps and similar packages
- Copy the keytab file into /opt/kyuubi
- Copy the init script run.sh, which patches /etc/hosts after the container starts
- Set the final USER to 0 (root); either `root` or `0` works
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Usage:
#   1. use ./build/dist to make binary distributions of Kyuubi or download a release
#   2. Untar it and run the docker command below
#      docker build -f docker/Dockerfile -t repository/kyuubi:tagname .
#   Options:
#     -f this docker file
#     -t the target repo and tag name
#     more options can be found with -h

ARG BASE_IMAGE=***.***.***.***/bigdata/openjdk:8-jre-slim
ARG spark_provided="spark_builtin"

FROM ${BASE_IMAGE} as builder_spark_provided
ONBUILD ARG spark_home_in_docker
ONBUILD ENV SPARK_HOME ${spark_home_in_docker}

FROM ${BASE_IMAGE} as builder_spark_builtin
ONBUILD ENV SPARK_HOME /opt/spark
ONBUILD RUN mkdir -p ${SPARK_HOME}
ONBUILD COPY spark-binary ${SPARK_HOME}

FROM builder_${spark_provided}

ARG kyuubi_uid=10009

USER root

ENV KYUUBI_HOME /opt/kyuubi
ENV KYUUBI_LOG_DIR ${KYUUBI_HOME}/logs
ENV KYUUBI_PID_DIR ${KYUUBI_HOME}/pid
ENV KYUUBI_WORK_DIR_ROOT ${KYUUBI_HOME}/work

RUN set -ex && \
    sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
    sed -i 's/http:\/\/security.\(.*\)/https:\/\/security.\1/g' /etc/apt/sources.list && \
    sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
    apt-get update && \
    apt-get install -y wget vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps && \
    useradd -u ${kyuubi_uid} -g root kyuubi && \
    mkdir -p ${KYUUBI_HOME} ${KYUUBI_LOG_DIR} ${KYUUBI_PID_DIR} ${KYUUBI_WORK_DIR_ROOT} && \
    chmod ug+rw -R ${KYUUBI_HOME} && \
    chmod a+rwx -R ${KYUUBI_WORK_DIR_ROOT} && \
    rm -rf /var/cache/apt/*

COPY bin ${KYUUBI_HOME}/bin
COPY jars ${KYUUBI_HOME}/jars
COPY beeline-jars ${KYUUBI_HOME}/beeline-jars
COPY externals/engines/spark ${KYUUBI_HOME}/externals/engines/spark
COPY hive.keytab /opt/kyuubi
COPY config /opt/kyuubi
COPY run.sh /opt/kyuubi

WORKDIR ${KYUUBI_HOME}

CMD [ "./bin/kyuubi", "run" ]

USER ${kyuubi_uid}
USER root
- Go back to /data/kyuubi-1.6.0 and run the following commands
# build the image
./bin/docker-image-tool.sh -S /opt/spark -b BASE_IMAGE=***.***.***.***/bigdata/spark:v3.3.0 -t v1.6.0 build
# re-tag the image
docker tag kyuubi:v1.6.0 ***.***.***.***/bigdata/kyuubi:v1.6.0
# push the image to the internal registry
docker push ***.***.***.***/bigdata/kyuubi:v1.6.0
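As with the Spark image, a quick sanity check that the Kyuubi image carries both Kyuubi and the Spark distribution from the base image (a minimal sketch):
docker run --rm --entrypoint /bin/bash ***.***.***.***/bigdata/kyuubi:v1.6.0 -c 'ls /opt/kyuubi/bin /opt/spark/bin && ls -l /opt/kyuubi/run.sh /opt/kyuubi/hive.keytab'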
2. Modify the Kyuubi service YAML files
Modify /kyuubi/docker/kyuubi-configmap.yaml
- Add the namespace to the metadata (namespace:)
- Add the kyuubi-env.sh and kyuubi-defaults.conf configuration content
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-defaults
data:
  kyuubi-env.sh: |
    export SPARK_HOME=/opt/spark
    export SPARK_CONF_DIR=${SPARK_HOME}/conf
    export HADOOP_CONF_DIR=${SPARK_HOME}/conf:${SPARK_HOME}/conf
    export KYUUBI_PID_DIR=/opt/kyuubi/pid
    export KYUUBI_LOG_DIR=/opt/kyuubi/logs
    export KYUUBI_WORK_DIR_ROOT=/opt/kyuubi/work
    export KYUUBI_MAX_LOG_FILES=10
  kyuubi-defaults.conf: |
    #
    ## Kyuubi Configurations
    #
    # kyuubi.authentication NONE
    # kyuubi.frontend.bind.host localhost
    # kyuubi.frontend.bind.port 10009
    #
    # Details in https://kyuubi.apache.org/docs/latest/deployment/settings.html
    kyuubi.authentication=KERBEROS
    kyuubi.kinit.principal=hive/****-****-****-****@****.****.****
    kyuubi.kinit.keytab=/opt/kyuubi/hive.keytab
    # Important: without this, clients may be unable to connect via the hostname once the Kyuubi
    # server is up; this setting makes the returned connection URL use the IP instead
    kyuubi.frontend.connection.url.use.hostname=false
    kyuubi.engine.share.level=USER
    kyuubi.session.engine.idle.timeout=PT1H
    kyuubi.ha.enabled=true
    kyuubi.ha.zookeeper.quorum=***.***.***.***:2181,***.***.***.***:2181,***.***.***.***:2181
    kyuubi.ha.zookeeper.namespace=kyuubi_on_k8s
    spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf
    spark.kubernetes.trust.certificates=true
    spark.kubernetes.file.upload.path=hdfs:///user/spark/k8s_upload
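One thing to prepare on the Hadoop side: the directory referenced by spark.kubernetes.file.upload.path should already exist on HDFS and be writable by the submitting user, otherwise cluster-mode submissions may fail to upload their files. A minimal sketch (run as a user with the required HDFS permissions):
hdfs dfs -mkdir -p /user/spark/k8s_upload
hdfs dfs -chmod 777 /user/spark/k8s_upload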
Modify /kyuubi/docker/kyuubi-deployment.yaml
- Update the metadata (namespace)
- Update the image reference (image)
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-deployment-example
  labels:
    app: kyuubi-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kyuubi-server
  template:
    metadata:
      labels:
        app: kyuubi-server
    spec:
      imagePullSecrets:
        - name: harbor-pull
      containers:
        - name: kyuubi-server
          # TODO: replace this with the stable tag
          image: ***.***.***.***/bigdata/kyuubi:v1.6.0
          #image: apache/kyuubi:master-snapshot
          imagePullPolicy: Always
          env:
            - name: KYUUBI_JAVA_OPTS
              value: -Dkyuubi.frontend.bind.host=0.0.0.0
          ports:
            - name: frontend-port
              containerPort: 10009
              protocol: TCP
          volumeMounts:
            - name: kyuubi-defaults
              mountPath: /opt/kyuubi/conf
      volumes:
        - name: kyuubi-defaults
          configMap:
            name: kyuubi-defaults
          #secret:
          #  secretName: kyuubi-defaults
Modify /kyuubi/docker/kyuubi-service.yaml
- Update the metadata (namespace)
apiVersion: v1
kind: Service
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-example-service
spec:
  ports:
    # The default nodePort range is 30000-32767
    # to change it:
    #   vim kube-apiserver.yaml (usually under /etc/kubernetes/manifests/)
    #   add or change the line 'service-node-port-range=1-32767' under kube-apiserver
    - nodePort: 30009
      # same as containerPort in the pod yaml
      port: 10009
      protocol: TCP
  type: NodePort
  selector:
    # same as the pod label
    app: kyuubi-server
3. Start the Kyuubi service from a k8s client node
- Apply the ConfigMap
kubectl apply -f docker/kyuubi-configmap.yaml
- Apply the Deployment
kubectl apply -f docker/kyuubi-deployment.yaml
- Apply the Service
kubectl apply -f docker/kyuubi-service.yaml
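Before moving on to beeline, a quick status check confirms that everything came up (a sketch; the names follow the YAML above):
kubectl get configmap,deployment,pods,svc -n ****-bd-k8s
kubectl logs deployment/kyuubi-deployment-example -n ****-bd-k8s | tail -20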
4. Connect with beeline locally from the Kyuubi client node
./bin/beeline -u 'jdbc:hive2://***.***.***.***:30009/default;principal=hive/***.***.***.***@HADOOP.****.TECH?spark.master=k8s://https://****.****.****/****/****/****;spark.submit.deployMode=cluster;spark.kubernetes.namespace=****-bd-k8s;spark.kubernetes.container.image.pullSecrets=harbor-pull;spark.kubernetes.authenticate.driver.serviceAccountName=flink;spark.kubernetes.trust.certificates=true;spark.kubernetes.executor.podNamePrefix=kyuubi-on-k8s;spark.kubernetes.container.image=***.***.***.***/bigdata/spark:v3.3.0;spark.dynamicAllocation.shuffleTracking.enabled=true;spark.dynamicAllocation.enabled=true;spark.dynamicAllocation.maxExecutors=10;spark.dynamicAllocation.minExecutors=5;spark.executor.instances=5;spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf' "$@"
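After beeline connects, a simple smoke test is to run a query and watch the Spark engine pods being created in the namespace (a sketch):
# inside beeline, e.g.:
#   show databases;
#   select 1;
# in another shell on the k8s client node, the driver/executor pods should appear
# (the executor pod names start with the spark.kubernetes.executor.podNamePrefix set above)
kubectl get pods -n ****-bd-k8s -w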