
1. Kyuubi in Practice on the 竞技世界 Big Data Platform -- Kyuubi on K8S Reading Kerberized CDH


Background

  1. The business-facing big data architecture uses several SQL engines (Spark, Flink, Trino, querying Hive, ClickHouse, etc. at the same time), so a unified SQL entry point is needed that works across multiple engines and platforms;
  2. This write-up is an initial step toward that goal (the remaining parts are in progress); since no existing account of Kyuubi on K8S could be found, the process is recorded here;

Base environment (non-K8S components only)

Component  Version
kyuubi     v1.6.0
spark      v3.3.0
CDH        v6.2.1

1. Build the Docker images

Build the Spark 3.3.0 image

  1. Modify the Spark configuration files

    1. Modify the spark-env.sh file

      Add the following (the paths are the future in-container paths):

      export HADOOP_CONF_DIR=/opt/spark/conf:/opt/spark/conf
      export YARN_CONF_DIR=/opt/spark/conf:/opt/spark/conf
      
    2. Leave the spark-defaults.conf file unchanged

  2. Copy the Hadoop configuration from CDH into /data/spark-3.3.0/conf (on CDH nodes the Hadoop client configuration lives under etc); see the sketch below
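
    For example, on a CDH gateway node the client configs can be collected roughly like this (a sketch; the /etc/hadoop/conf and /etc/hive/conf paths are the usual CDH client-config locations and may differ on your cluster):

    #copy the CDH Hadoop/Hive client configuration into the conf dir that will be baked into the Spark image
    cp /etc/hadoop/conf/core-site.xml /data/spark-3.3.0/conf/
    cp /etc/hadoop/conf/hdfs-site.xml /data/spark-3.3.0/conf/
    cp /etc/hadoop/conf/yarn-site.xml /data/spark-3.3.0/conf/
    cp /etc/hive/conf/hive-site.xml /data/spark-3.3.0/conf/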

  3. Edit the init script run.sh (the content below needs to go into it; *** stands for site-specific values)

    #so the CDH cluster IPs can be resolved later
    echo " ***.***.***.***  "  >> /etc/hosts
    #configuration file required for Kerberos authentication
    echo "***" > /etc/krb5.conf
    #perform the Kerberos authentication inside the image
    kinit -kt /opt/spark/work-dir/hive.keytab hive/***@****.****.****
    

    Put the finished run.sh under /data/spark-3.3.0/, and place the keytab file there as well (any location works, but this one is the most convenient, since the Dockerfile below references it)
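
    A fully filled-in run.sh might look like the sketch below; every IP, hostname, and realm in it is hypothetical and has to be replaced with your own CDH cluster values:

    #!/bin/bash
    #resolve the (hypothetical) CDH hosts inside the pod
    echo "192.0.2.11  cdh-nn01.example.com" >> /etc/hosts
    echo "192.0.2.12  cdh-dn01.example.com" >> /etc/hosts
    #write a minimal krb5.conf pointing at the (hypothetical) KDC
    printf '%s\n' \
      '[libdefaults]' \
      '  default_realm = EXAMPLE.COM' \
      '[realms]' \
      '  EXAMPLE.COM = {' \
      '    kdc = kdc.example.com' \
      '  }' > /etc/krb5.conf
    #obtain a ticket with the keytab baked into the image
    kinit -kt /opt/spark/work-dir/hive.keytab hive/cdh-nn01.example.com@EXAMPLE.COM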

  4. Modify the script that runs when the container starts, /data/spark-3.3.0/kubernetes/dockerfiles/spark/entrypoint.sh (the changed lines were highlighted with a magenta background in the original post)

    Key content:

    1. Invoke the driver and executor runtime init script run.sh (777 permissions are used purely for convenience)
    #!/bin/bash
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    
    # echo commands to the terminal output
    set -ex
    
    # Check whether there is a passwd entry for the container UID
    #myuid=$(id -u)
    myuid=0
    mygid=$(id -g)
    # turn off -e for getent because it will return error code in anonymous uid case
    set +e
    uidentry=$(getent passwd $myuid)
    set -e
    
    # If there is no passwd entry for the container UID, attempt to create one
    if [ -z "$uidentry" ] ; then
        if [ -w /etc/passwd ] ; then
            echo "$myuid:x:$myuid:$mygid:${SPARK_USER_NAME:-anonymous uid}:$SPARK_HOME:/bin/false" >> /etc/passwd
        else
            echo "Container ENTRYPOINT failed to add passwd entry for anonymous UID"
        fi
    fi
    
    if [ -z "$JAVA_HOME" ]; then
      JAVA_HOME=$(java -XshowSettings:properties -version 2>&1 > /dev/null | grep 'java.home' | awk '{print $3}')
    fi
    
    SPARK_CLASSPATH="$SPARK_CLASSPATH:${SPARK_HOME}/jars/*"
    env | grep SPARK_JAVA_OPT_ | sort -t_ -k4 -n | sed 's/[^=]*=\(.*\)/\1/g' > /tmp/java_opts.txt
    readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt
    
    if [ -n "$SPARK_EXTRA_CLASSPATH" ]; then
      SPARK_CLASSPATH="$SPARK_CLASSPATH:$SPARK_EXTRA_CLASSPATH"
    fi
    
    if ! [ -z ${PYSPARK_PYTHON+x} ]; then
        export PYSPARK_PYTHON
    fi
    if ! [ -z ${PYSPARK_DRIVER_PYTHON+x} ]; then
        export PYSPARK_DRIVER_PYTHON
    fi
    
    # If HADOOP_HOME is set and SPARK_DIST_CLASSPATH is not set, set it here so Hadoop jars are available to the executor.
    # It does not set SPARK_DIST_CLASSPATH if already set, to avoid overriding customizations of this value from elsewhere e.g. Docker/K8s.
    if [ -n "${HADOOP_HOME}"  ] && [ -z "${SPARK_DIST_CLASSPATH}"  ]; then
      export SPARK_DIST_CLASSPATH="$($HADOOP_HOME/bin/hadoop classpath)"
    fi
    
    if ! [ -z ${HADOOP_CONF_DIR+x} ]; then
      SPARK_CLASSPATH="$HADOOP_CONF_DIR:$SPARK_CLASSPATH";
    fi
    
    if ! [ -z ${SPARK_CONF_DIR+x} ]; then
      SPARK_CLASSPATH="$SPARK_CONF_DIR:$SPARK_CLASSPATH";
    elif ! [ -z ${SPARK_HOME+x} ]; then
      SPARK_CLASSPATH="$SPARK_HOME/conf:$SPARK_CLASSPATH";
    fi
    
    
    
    case "$1" in
      driver)
        shift 1
        chmod 777 /opt/spark/work-dir/run.sh
        /bin/bash /opt/spark/work-dir/run.sh
        cat /etc/hosts
        CMD=(
          "$SPARK_HOME/bin/spark-submit"
          --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
          --deploy-mode client
          "$@"
        )
        ;;
      executor)
        shift 1
        chmod 777 /opt/spark/work-dir/run.sh
        /bin/bash /opt/spark/work-dir/run.sh
        cat /etc/hosts
        CMD=(
          ${JAVA_HOME}/bin/java
          "${SPARK_EXECUTOR_JAVA_OPTS[@]}"
          -Xms$SPARK_EXECUTOR_MEMORY
          -Xmx$SPARK_EXECUTOR_MEMORY
          -cp "$SPARK_CLASSPATH:$SPARK_DIST_CLASSPATH"
          org.apache.spark.scheduler.cluster.k8s.KubernetesExecutorBackend
          --driver-url $SPARK_DRIVER_URL
          --executor-id $SPARK_EXECUTOR_ID
          --cores $SPARK_EXECUTOR_CORES
          --app-id $SPARK_APPLICATION_ID
          --hostname $SPARK_EXECUTOR_POD_IP
          --resourceProfileId $SPARK_RESOURCE_PROFILE_ID
          --podName $SPARK_EXECUTOR_POD_NAME
        )
        ;;
    
      *)
        echo "Non-spark-on-k8s command provided, proceeding in pass-through mode..."
        CMD=("$@")
        ;;
    esac
    
    # Execute the container CMD under tini for better hygiene
    exec /usr/bin/tini -s -- "${CMD[@]}"
    
  5. Edit /data/spark-3.3.0/kubernetes/dockerfiles/spark/Dockerfile

    Key points:

    1. Change the openjdk base-image source (optional, but with a poor network the image may fail to pull)
    2. Change the Debian package mirror (same reason as above)
    3. Install vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps and similar packages (to make later work inside the container easier)
    4. Copy the files under conf to /opt/spark/conf
    5. Copy the keytab file to the /opt/spark/work-dir path
    6. Copy the init script run.sh, which is used to patch /etc/hosts after the container is pulled up
    7. Set spark_uid to 0 (root) (needed in order to modify the hosts file)
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    ARG java_image_tag=8-jre-slim
    
    FROM ***.***.***.***/bigdata/openjdk:${java_image_tag}
    
    #ARG spark_uid=185
    ARG spark_uid=0
    
    # Before building the docker image, first build and make a Spark distribution following
    # the instructions in https://spark.apache.org/docs/latest/building-spark.html.
    # If this docker file is being used in the context of building your images from a Spark
    # distribution, the docker build command should be invoked from the top level directory
    # of the Spark distribution. E.g.:
    # docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .
    
    RUN set -ex && \
        sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
        sed -i 's/http:\/\/security.\(.*\)/https:\/\/security.\1/g' /etc/apt/sources.list && \
        sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
        sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
        apt-get update && \
        ln -s /lib /lib64 && \
        apt-get install -y vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps && \
        mkdir -p /opt/spark && \
        mkdir -p /opt/spark/examples && \
        mkdir -p /opt/spark/work-dir && \
        mkdir -p /opt/hadoop && \
        touch /opt/spark/RELEASE && \
        rm /bin/sh && \
        ln -sv /bin/bash /bin/sh && \
        echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
        chgrp root /etc/passwd && chmod ug+rw /etc/passwd && \
        rm -rf /var/cache/apt/*
    
    COPY jars /opt/spark/jars
    COPY bin /opt/spark/bin
    COPY sbin /opt/spark/sbin
    COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
    COPY kubernetes/dockerfiles/spark/decom.sh /opt/
    COPY examples /opt/spark/examples
    COPY kubernetes/tests /opt/spark/tests
    #COPY hadoop/conf /opt/hadoop/conf
    COPY conf /opt/spark/conf
    COPY data /opt/spark/data
    COPY hive.keytab /opt/spark/work-dir
    COPY run.sh /opt/spark/work-dir
    
    ENV SPARK_HOME /opt/spark
    
    WORKDIR /opt/spark/work-dir
    RUN chmod 777 /opt/spark/work-dir
    RUN chmod a+x /opt/decom.sh
    RUN chmod 777 /opt/spark/work-dir/run.sh
    ENTRYPOINT [ "/opt/entrypoint.sh" ]
    
    # Specify the User that the actual main process will run as
    USER ${spark_uid}
    
  6. Go back to /data/spark-3.3.0 and run the commands below

    #build the image
    ./bin/docker-image-tool.sh -t v3.3.0 build
    #re-tag the image
    docker tag spark:v3.3.0 ***.***.***.***/bigdata/spark:v3.3.0
    #push the image to the internal registry (self-hosted inside the company)
    docker push ***.***.***.***/bigdata/spark:v3.3.0
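
    Once pushed, the image can be sanity-checked locally (a sketch; the registry address is the same placeholder used above, and non-spark commands fall through to the entrypoint's pass-through mode):

    #confirm the Spark build and the Kerberos client tools made it into the image
    docker run --rm ***.***.***.***/bigdata/spark:v3.3.0 /opt/spark/bin/spark-submit --version
    docker run --rm ***.***.***.***/bigdata/spark:v3.3.0 bash -c 'command -v kinit'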
    

Build the Kyuubi 1.6.0 image

  1. Kyuubi's configuration files do not need to be changed; the project provides a more convenient mechanism (kyuubi-configmap.yaml)

  2. Write the init script run.sh (the commands inside the script do not always take effect, so check in the started container whether kubectl can actually create pods; "***" stands for site-specific values)

    Key points:

    1. kubectl has to be downloaded separately (here it is served from an internal web server, see the script below)
    mkdir -p /etc/.kube
    chmod 777 /etc/.kube
    cp /opt/kyuubi/config /etc/.kube/
    #the key step that makes kubectl usable
    echo "export  KUBECONFIG=/etc/.kube/config" >> /etc/profile
    export  KUBECONFIG=/etc/.kube/config
    source /etc/profile
    
    #kubectl is hosted on an internal server so it is easy to download
    wget http://***.***.***.***/yum/k8s/kubectl
    chmod +x ./kubectl
    mv ./kubectl /usr/bin/
    #check whether kubectl was installed successfully
    kubectl version --client
    
    echo "***"  >> /etc/hosts
    
    echo "***" > /etc/krb5.conf
    
    kinit -kt /opt/kyuubi/hive.keytab hive/***@HADOOP.****.***
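
    Whether kubectl inside the Kyuubi container can really create pods can be verified with something like this (a sketch run inside the container; the namespace is the same placeholder used in the YAML files later):

    kubectl auth can-i create pods -n ****-bd-k8s
    kubectl get pods -n ****-bd-k8s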
    
    
  3. Modify /data/kyuubi-1.6.0/bin/kyuubi

    Key content:

    1. In the "kyuubi run" branch add
      chmod 777 /opt/kyuubi/run.sh
      /bin/bash /opt/kyuubi/run.sh
      
    #!/usr/bin/env bash
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    
    ## Kyuubi Server Main Entrance
    CLASS="org.apache.kyuubi.server.KyuubiServer"
    
    function usage() {
      echo "Usage: bin/kyuubi command"
      echo "  commands:"
      echo "    start        - Run a Kyuubi server as a daemon"
      echo "    restart      - Restart Kyuubi server as a daemon"
      echo "    run          - Run a Kyuubi server in the foreground"
      echo "    stop         - Stop the Kyuubi daemon"
      echo "    status       - Show status of the Kyuubi daemon"
      echo "    -h | --help  - Show this help message"
    }
    
    if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
      usage
      exit 0
    fi
    
    function kyuubi_logo() {
      source ${KYUUBI_HOME}/bin/kyuubi-logo
    }
    
    function kyuubi_rotate_log() {
      log=$1;
    
      if [[ -z ${KYUUBI_MAX_LOG_FILES} ]]; then
        num=5
      elif [[ ${KYUUBI_MAX_LOG_FILES} -gt 0 ]]; then
        num=${KYUUBI_MAX_LOG_FILES}
      else
        echo "Error: KYUUBI_MAX_LOG_FILES must be a positive number, but got ${KYUUBI_MAX_LOG_FILES}"
        exit -1
      fi
    
      if [ -f "$log" ]; then # rotate logs
      while [ ${num} -gt 1 ]; do
        prev=`expr ${num} - 1`
        [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
        num=${prev}
      done
      mv "$log" "$log.$num";
      fi
    }
    
    export KYUUBI_HOME="$(cd "$(dirname "$0")"/..; pwd)"
    
    if [[ $1 == "start" ]] || [[ $1 == "run" ]]; then
      . "${KYUUBI_HOME}/bin/load-kyuubi-env.sh"
    else
      . "${KYUUBI_HOME}/bin/load-kyuubi-env.sh" -s
    fi
    
    if [[ -z ${JAVA_HOME} ]]; then
      echo "Error: JAVA_HOME IS NOT SET! CANNOT PROCEED."
      exit 1
    fi
    
    RUNNER="${JAVA_HOME}/bin/java"
    
    ## Find the Kyuubi Jar
    if [[ -z "$KYUUBI_JAR_DIR" ]]; then
      KYUUBI_JAR_DIR="$KYUUBI_HOME/jars"
      if [[ ! -d ${KYUUBI_JAR_DIR} ]]; then
      echo -e "\nCandidate Kyuubi lib $KYUUBI_JAR_DIR doesn't exist, searching development environment..."
        KYUUBI_JAR_DIR="$KYUUBI_HOME/kyuubi-assembly/target/scala-${KYUUBI_SCALA_VERSION}/jars"
      fi
    fi
    
    if [[ -z ${YARN_CONF_DIR} ]]; then
      KYUUBI_CLASSPATH="${KYUUBI_JAR_DIR}/*:${KYUUBI_CONF_DIR}:${HADOOP_CONF_DIR}"
    else
      KYUUBI_CLASSPATH="${KYUUBI_JAR_DIR}/*:${KYUUBI_CONF_DIR}:${HADOOP_CONF_DIR}:${YARN_CONF_DIR}"
    fi
    
    cmd="${RUNNER} ${KYUUBI_JAVA_OPTS} -cp ${KYUUBI_CLASSPATH} $CLASS"
    
    pid="${KYUUBI_PID_DIR}/kyuubi-$USER-$CLASS.pid"
    
    function start_kyuubi() {
      if [[ ! -w ${KYUUBI_PID_DIR} ]]; then
        echo "${USER} does not have 'w' permission to ${KYUUBI_PID_DIR}"
        exit 1
      fi
    
      if [[ ! -w ${KYUUBI_LOG_DIR} ]]; then
        echo "${USER} does not have 'w' permission to ${KYUUBI_LOG_DIR}"
        exit 1
      fi
    
      if [ -f "$pid" ]; then
        TARGET_ID="$(cat "$pid")"
        if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
          echo "$CLASS running as process $TARGET_ID  Stop it first."
          exit 1
        fi
      fi
    
      log="${KYUUBI_LOG_DIR}/kyuubi-$USER-$CLASS-$HOSTNAME.out"
      kyuubi_rotate_log ${log}
    
      echo "Starting $CLASS, logging to $log"
      nohup nice -n "${KYUUBI_NICENESS:-0}" ${cmd} >> ${log} 2>&1 < /dev/null &
      newpid="$!"
    
      echo "$newpid" > "$pid"
    
      # Poll for up to 5 seconds for the java process to start
      for i in {1..10}
      do
        if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
           break
        fi
        sleep 0.5
      done
    
      sleep 2
      # Check if the process has died; in that case we'll tail the log so the user can see
      if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
        echo "Failed to launch: ${cmd}"
        tail -2 "$log" | sed 's/^/  /'
        echo "Full log in $log"
      else
        echo "Welcome to"
        kyuubi_logo
      fi
    }
    
    function run_kyuubi() {
      echo "Starting $CLASS"
      nice -n "${KYUUBI_NICENESS:-0}" ${cmd}
    }
    
    function stop_kyuubi() {
      if [ -f ${pid} ]; then
        TARGET_ID="$(cat "$pid")"
        if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
          echo "Stopping $CLASS"
          kill "$TARGET_ID" && rm -f "$pid"
          for i in {1..20}
          do
            sleep 0.5
            if [[ ! $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
              break
            fi
          done
    
          if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
            echo "Failed to stop kyuubi after 10 seconds, try 'kill -9 ${TARGET_ID}' forcefully "
          else
            kyuubi_logo
            echo "Bye!"
          fi
        else
          echo "no $CLASS to stop"
        fi
      else
        echo "no $CLASS to stop"
      fi
    }
    
    function check_kyuubi() {
      if [[ -f ${pid} ]]; then
        TARGET_ID="$(cat "$pid")"
        if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
          echo "Kyuubi is running (pid: $TARGET_ID)"
        else
          echo "Kyuubi is not running"
        fi
      else
        echo "Kyuubi is not running"
      fi
    }
    
    case $1 in
      (start | "")
        start_kyuubi
        ;;
    
      (restart)
        echo "Restarting Kyuubi"
        stop_kyuubi
        start_kyuubi
        ;;
    
      (run)
        chmod 777 /opt/kyuubi/run.sh
        /bin/bash /opt/kyuubi/run.sh
        run_kyuubi
        ;;
    
      (stop)
        stop_kyuubi
        ;;
    
      (status)
        check_kyuubi
        ;;
    
      (*)
        usage
        ;;
    esac
    
  4. Edit /data/kyuubi-1.6.0/docker/Dockerfile

    Key points:

    1. Change the openjdk base-image source
    2. Change the Debian package mirror
    3. Install wget vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps and similar packages
    4. Copy the keytab file to the /opt/kyuubi path
    5. Copy the init script run.sh, which is used to patch /etc/hosts after the container is pulled up
    6. Set the final user to 0 (root) (either "root" or "0" works)
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #    http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    
    # Usage:
    #   1. use ./build/dist to make binary distributions of Kyuubi or download a release
    #   2. Untar it and run the docker command below
    #      docker build -f docker/Dockerfile -t repository/kyuubi:tagname .
    #   Options:
    #     -f this docker file
    #     -t the target repo and tag name
    #     more options can be found with -h
    
    ARG BASE_IMAGE=***.***.***.***/bigdata/openjdk:8-jre-slim
    ARG spark_provided="spark_builtin"
    
    FROM ${BASE_IMAGE} as builder_spark_provided
    ONBUILD ARG spark_home_in_docker
    ONBUILD ENV SPARK_HOME ${spark_home_in_docker}
    
    FROM ${BASE_IMAGE} as builder_spark_builtin
    
    ONBUILD ENV SPARK_HOME /opt/spark
    ONBUILD RUN mkdir -p  ${SPARK_HOME}
    ONBUILD COPY spark-binary ${SPARK_HOME}
    
    FROM builder_${spark_provided}
    
    ARG kyuubi_uid=10009
    USER root
    
    ENV KYUUBI_HOME /opt/kyuubi
    ENV KYUUBI_LOG_DIR ${KYUUBI_HOME}/logs
    ENV KYUUBI_PID_DIR ${KYUUBI_HOME}/pid
    ENV KYUUBI_WORK_DIR_ROOT ${KYUUBI_HOME}/work
    
    RUN set -ex && \
        sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
        sed -i 's/http:\/\/security.\(.*\)/https:\/\/security.\1/g' /etc/apt/sources.list && \
        sed -i s@/security.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
        sed -i s@/deb.debian.org/@/mirrors.aliyun.com/@g /etc/apt/sources.list && \
        apt-get update && \
        apt-get install -y wget vim sudo net-tools lsof bash tini libc6 libpam-modules krb5-user libpam-krb5 libpam-ccreds libkrb5-dev libnss3 procps && \
        useradd -u ${kyuubi_uid} -g root kyuubi && \
        mkdir -p ${KYUUBI_HOME} ${KYUUBI_LOG_DIR} ${KYUUBI_PID_DIR} ${KYUUBI_WORK_DIR_ROOT} && \
        chmod ug+rw -R ${KYUUBI_HOME} && \
        chmod a+rwx -R ${KYUUBI_WORK_DIR_ROOT} && \
        rm -rf /var/cache/apt/*
    
    COPY bin ${KYUUBI_HOME}/bin
    COPY jars ${KYUUBI_HOME}/jars
    COPY beeline-jars ${KYUUBI_HOME}/beeline-jars
    COPY externals/engines/spark ${KYUUBI_HOME}/externals/engines/spark
    COPY hive.keytab /opt/kyuubi
    COPY config /opt/kyuubi
    COPY run.sh /opt/kyuubi
    
    
    WORKDIR ${KYUUBI_HOME}
    
    CMD [ "./bin/kyuubi", "run" ]
    
    USER ${kyuubi_uid}
    
    # switched back to root so run.sh can modify /etc/hosts at startup
    USER root
    
  5. Go back to /data/kyuubi-1.6.0 and run the commands below

    #build the image
    ./bin/docker-image-tool.sh -S /opt/spark -b BASE_IMAGE=***.***.***.***/bigdata/spark:v3.3.0 -t v1.6.0  build
    #re-tag the image
    docker tag kyuubi:v1.6.0 ***.***.***.***/bigdata/kyuubi:v1.6.0
    #push the image to the internal registry
    docker push ***.***.***.***/bigdata/kyuubi:v1.6.0
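
    A quick check that the Kyuubi image contains both Kyuubi and the embedded Spark distribution (a sketch; the registry address is the same placeholder used above):

    docker run --rm ***.***.***.***/bigdata/kyuubi:v1.6.0 bash -c 'ls /opt/kyuubi/bin /opt/spark/bin'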
    

2. Modify the Kyuubi service YAML files

Modify /kyuubi/docker/kyuubi-configmap.yaml

  1. Add the namespace information: namespace:
  2. Add the kyuubi-env.sh and kyuubi-defaults.conf configuration content
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-defaults
data:
  kyuubi-env.sh: |
     export SPARK_HOME=/opt/spark
     export SPARK_CONF_DIR=${SPARK_HOME}/conf
     export HADOOP_CONF_DIR=${SPARK_HOME}/conf:${SPARK_HOME}/conf

     export KYUUBI_PID_DIR=/opt/kyuubi/pid
     export KYUUBI_LOG_DIR=/opt/kyuubi/logs
     export KYUUBI_WORK_DIR_ROOT=/opt/kyuubi/work
     export KYUUBI_MAX_LOG_FILES=10
  kyuubi-defaults.conf: |
    #
    ## Kyuubi Configurations
    #
    # kyuubi.authentication           NONE
    # kyuubi.frontend.bind.host       localhost
    # kyuubi.frontend.bind.port       10009
    #

    # Details in https://kyuubi.apache.org/docs/latest/deployment/settings.html
      kyuubi.authentication=KERBEROS
      kyuubi.kinit.principal=hive/****-****-****-****@****.****.****
      kyuubi.kinit.keytab=/opt/kyuubi/hive.keytab
      
      # Important: without this, clients may be unable to connect by hostname once the Kyuubi service is up; this setting makes the connection URL use the IP instead
      kyuubi.frontend.connection.url.use.hostname=false

      kyuubi.engine.share.level=USER
      kyuubi.session.engine.idle.timeout=PT1H

      kyuubi.ha.enabled=true
      kyuubi.ha.zookeeper.quorum=***.***.***.***:2181,***.***.***.***:2181,***.***.***.***:2181
      kyuubi.ha.zookeeper.namespace=kyuubi_on_k8s

      spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf

      spark.kubernetes.trust.certificates=true

      spark.kubernetes.file.upload.path=hdfs:///user/spark/k8s_upload
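
The spark.* entries in kyuubi-defaults.conf are forwarded to the spark-submit that Kyuubi issues when it launches an engine; roughly, the resulting submit looks like the sketch below (a simplification rather than the exact command Kyuubi builds; the API server address and the engine jar name are assumptions):

    $SPARK_HOME/bin/spark-submit \
      --class org.apache.kyuubi.engine.spark.SparkSQLEngine \
      --conf spark.master=k8s://https://<k8s-apiserver> \
      --conf spark.submit.deployMode=cluster \
      --conf spark.kubernetes.namespace=****-bd-k8s \
      --conf spark.kubernetes.container.image=***.***.***.***/bigdata/spark:v3.3.0 \
      --conf spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf \
      --conf spark.kubernetes.file.upload.path=hdfs:///user/spark/k8s_upload \
      /opt/kyuubi/externals/engines/spark/kyuubi-spark-sql-engine_2.12-1.6.0.jar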

Modify /kyuubi/docker/kyuubi-deployment.yaml

  1. Change the metadata: namespace
  2. Change the image reference: image
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-deployment-example
  labels:
    app: kyuubi-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kyuubi-server
  template:
    metadata:
      labels:
        app: kyuubi-server
    spec:
      imagePullSecrets:
        - name: harbor-pull
      containers:
        - name: kyuubi-server
          # TODO: replace this with the stable tag
          image: ***.***.***.***/bigdata/kyuubi:v1.6.0
          #image: apache/kyuubi:master-snapshot
          imagePullPolicy: Always
          env:
            - name: KYUUBI_JAVA_OPTS
              value: -Dkyuubi.frontend.bind.host=0.0.0.0
          ports:
            - name: frontend-port
              containerPort: 10009
              protocol: TCP
          volumeMounts:
            - name: kyuubi-defaults
              mountPath: /opt/kyuubi/conf
      volumes:
        - name: kyuubi-defaults
          configMap:
            name: kyuubi-defaults
          #secret:      
            #secretName: kyuubi-defaults
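
The Spark driver pods that Kyuubi launches need a service account with permission to create and delete pods in the namespace (the beeline URL in section 4 refers to one named "flink"). If such an account does not exist yet, it can be created roughly like this (a sketch using the built-in "edit" ClusterRole; adapt it to your own RBAC policy):

  kubectl create serviceaccount flink -n ****-bd-k8s
  kubectl create rolebinding flink-edit --clusterrole=edit --serviceaccount=****-bd-k8s:flink -n ****-bd-k8s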

Modify /kyuubi/docker/kyuubi-service.yaml

  1. Change the metadata: namespace
apiVersion: v1
kind: Service
metadata:
  namespace: ****-bd-k8s
  name: kyuubi-example-service
spec:
  ports:
    # The default port limit is 30000-32767
    # to change:
    #   vim kube-apiserver.yaml (usually under path: /etc/kubernetes/manifests/)
    #   add or change line 'service-node-port-range=1-32767' under kube-apiserver
    - nodePort: 30009
      # same of containerPort in pod yaml
      port: 10009
      protocol: TCP
  type: NodePort
  selector:
    # same of pod label
    app: kyuubi-server

3. Run the Kyuubi service from a K8S client node

  1. Apply the ConfigMap
    kubectl apply -f docker/kyuubi-configmap.yaml
    
  2. Apply the Deployment
    kubectl apply -f docker/kyuubi-deployment.yaml
    
  3. Apply the Service
    kubectl apply -f docker/kyuubi-service.yaml
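
    After applying the three files, confirm the server pod is running and inspect its log (the deployment and service names come from the YAML above):

    kubectl get pods -n ****-bd-k8s -l app=kyuubi-server
    kubectl logs -n ****-bd-k8s deployment/kyuubi-deployment-example
    kubectl get svc -n ****-bd-k8s kyuubi-example-service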
    

4. Connect with the local beeline on the Kyuubi client node

./bin/beeline -u 'jdbc:hive2://***.***.***.***:30009/default;principal=hive/***.***.***.***@HADOOP.****.TECH?spark.master=k8s://https://****.****.****/****/****/****;spark.submit.deployMode=cluster;spark.kubernetes.namespace=****-bd-k8s;spark.kubernetes.container.image.pullSecrets=harbor-pull;spark.kubernetes.authenticate.driver.serviceAccountName=flink;spark.kubernetes.trust.certificates=true;spark.kubernetes.executor.podNamePrefix=kyuubi-on-k8s;spark.kubernetes.container.image=***.***.***.***/bigdata/spark:v3.3.0;spark.dynamicAllocation.shuffleTracking.enabled=true;spark.dynamicAllocation.enabled=true;spark.dynamicAllocation.maxExecutors=10;spark.dynamicAllocation.minExecutors=5;spark.executor.instances=5;spark.kubernetes.kerberos.krb5.path=/etc/krb5.conf' "$@"
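
A quick functional test after connecting: run a simple statement so that Kyuubi launches a Spark engine, then watch the driver and executor pods come up in the namespace (a sketch; use the same JDBC URL as above, and note that the executor pods carry the kyuubi-on-k8s prefix configured in that URL):

./bin/beeline -u '<same JDBC URL as above>' -e 'show databases;'
kubectl get pods -n ****-bd-k8s | grep kyuubi-on-k8s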

5. Results

https://mp.weixin.qq.com/s?__biz=Mzg2MDA5MTU4OA==&mid=2247497852&idx=1&sn=8facac617e9c101c653beafdd58a79bb&chksm=ce291dd7f95e94c1f79f4dd93adaa8182432cbf7775d37231b1deb9726ada4dd2e037269afc2&scene=178&cur_album_id=2087652178063622145#rd

From: https://www.cnblogs.com/HYBG/p/17157706.html
