Spark Source Code | Script Analysis Summary

Preface

I originally wanted to study the source code of Spark's submission process, for example Spark on YARN and Standalone. Previously I only had a rough picture of the overall submission flow from articles summarized online, but each article described it a little differently and it was hard to tell which one was accurate: for example, the difference between client and cluster mode, what exactly the Driver does, and how it is defined. To clear up these questions once and for all, I decided to study the relevant source code. Since both service startup and application startup go through shell scripts, we start by analyzing the scripts.

Version

Spark 3.2.3

Spark Scripts

Let's first look at Spark's main scripts: spark-submit, spark-sql, spark-shell, spark-class, start-all.sh, stop-all.sh, start-master.sh, start-workers.sh, and so on.

spark-sql

if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

export _SPARK_CMD_USAGE="Usage: ./bin/spark-sql [options] [cli option]"
exec "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver "$@"

This submits the class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver via spark-submit.
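
For example (an illustrative invocation, not part of the original analysis), because "$@" forwards all user arguments unchanged, running

bin/spark-sql --master yarn -e "select 1"

effectively executes

bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver --master yarn -e "select 1"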

spark-shell

# Shell script for starting the Spark Shell REPL

cygwin=false
case "$(uname)" in
  CYGWIN*) cygwin=true;;
esac

# Enter posix mode for bash
set -o posix

if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

export _SPARK_CMD_USAGE="Usage: ./bin/spark-shell [options]

Scala REPL options:
  -I <file>                   preload <file>, enforcing line-by-line interpretation"

# SPARK-4161: scala does not assume use of the java classpath,
# so we need to add the "-Dscala.usejavacp=true" flag manually. We
# do this specifically for the Spark shell because the scala REPL
# has its own class loader, and any additional classpath specified
# through spark.driver.extraClassPath is not automatically propagated.
SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Dscala.usejavacp=true"

function main() {
  if $cygwin; then
    # Workaround for issue involving JLine and Cygwin
    # (see http://sourceforge.net/p/jline/bugs/40/).
    # If you're using the Mintty terminal emulator in Cygwin, may need to set the
    # "Backspace sends ^H" setting in "Keys" section of the Mintty options
    # (see https://github.com/sbt/sbt/issues/562).
    stty -icanon min 1 -echo > /dev/null 2>&1
    export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
    "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"
    stty icanon echo > /dev/null 2>&1
  else
    export SPARK_SUBMIT_OPTS
    "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"
  fi
}

# Copy restore-TTY-on-exit functions from Scala script so spark-shell exits properly even in
# binary distribution of Spark where Scala is not installed
exit_status=127
saved_stty=""

# restore stty settings (echo in particular)
function restoreSttySettings() {
  stty $saved_stty
  saved_stty=""
}

function onExit() {
  if [[ "$saved_stty" != "" ]]; then
    restoreSttySettings
  fi
  exit $exit_status
}

# to reenable echo if we are interrupted before completing.
trap onExit INT

# save terminal settings
saved_stty=$(stty -g 2>/dev/null)
# clear on error so we don't later try to restore them
if [[ ! $? ]]; then
  saved_stty=""
fi

main "$@"

# record the exit status lest it be overwritten:
# then reenable echo and propagate the code.
exit_status=$?
onExit

The main logic here is likewise to submit a class, org.apache.spark.repl.Main, via spark-submit.

spark-submit

From the analysis above, both interactive command-line scripts, spark-sql and spark-shell, are implemented via spark-submit --class ClassName.

if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

# disable randomized hash for string in Python 3.3+
export PYTHONHASHSEED=0

exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit "$@"

The logic is clear: it launches org.apache.spark.deploy.SparkSubmit via spark-class.
For spark-sql and spark-shell, the resulting commands are:

/opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
/opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name Spark shell

start-all.sh

Purpose: start all Standalone daemons. For the related configuration, refer to the Spark Standalone cluster configuration guide.

# Start all spark daemons.
# Starts the master on this node.
# Starts a worker on each node specified in conf/workers

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

# Load the Spark configuration
. "${SPARK_HOME}/sbin/spark-config.sh"

# Start Master
"${SPARK_HOME}/sbin"/start-master.sh

# Start Workers
"${SPARK_HOME}/sbin"/start-workers.sh

Main logic: start-master.sh starts the Master, and start-workers.sh starts all the Workers.

start-master.sh

# Starts the master on the machine this script is executed on.

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

# NOTE: This exact class name is matched downstream by SparkSubmit.
# Any changes need to be reflected there.
CLASS="org.apache.spark.deploy.master.Master"

if [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
  echo "Usage: ./sbin/start-master.sh [options]"
  pattern="Usage:"
  pattern+="\|Using Spark's default log4j profile:"
  pattern+="\|Started daemon with process name"
  pattern+="\|Registered signal handler for"

  "${SPARK_HOME}"/bin/spark-class $CLASS --help 2>&1 | grep -v "$pattern" 1>&2
  exit 1
fi

ORIGINAL_ARGS="$@"

. "${SPARK_HOME}/sbin/spark-config.sh"

. "${SPARK_HOME}/bin/load-spark-env.sh"

if [ "$SPARK_MASTER_PORT" = "" ]; then
  SPARK_MASTER_PORT=7077
fi

if [ "$SPARK_MASTER_HOST" = "" ]; then
  case `uname` in
      (SunOS)
	  SPARK_MASTER_HOST="`/usr/sbin/check-hostname | awk '{print $NF}'`"
	  ;;
      (*)
	  SPARK_MASTER_HOST="`hostname -f`"
	  ;;
  esac
fi

if [ "$SPARK_MASTER_WEBUI_PORT" = "" ]; then
  SPARK_MASTER_WEBUI_PORT=8080
fi

"${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS 1 \
  --host $SPARK_MASTER_HOST --port $SPARK_MASTER_PORT --webui-port $SPARK_MASTER_WEBUI_PORT \
  $ORIGINAL_ARGS

Main logic: spark-daemon.sh starts the class org.apache.spark.deploy.master.Master.

With concrete arguments: /opt/dkl/spark-3.2.3-bin-hadoop3.2/sbin/spark-daemon.sh start org.apache.spark.deploy.master.Master 1 --host indata-10-110-8-199.indata.com --port 7077 --webui-port 8080
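
If the defaults need to change, the same environment variables this script reads can be set in conf/spark-env.sh. A minimal sketch (hypothetical values; the variable names come from the script above):

# conf/spark-env.sh
export SPARK_MASTER_HOST=master.example.com    # host the Master binds to
export SPARK_MASTER_PORT=7077                  # RPC port, script default 7077
export SPARK_MASTER_WEBUI_PORT=8080            # web UI port, script default 8080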

start-workers.sh

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

. "${SPARK_HOME}/sbin/spark-config.sh"
. "${SPARK_HOME}/bin/load-spark-env.sh"

# Find the port number for the master
if [ "$SPARK_MASTER_PORT" = "" ]; then
  SPARK_MASTER_PORT=7077
fi

if [ "$SPARK_MASTER_HOST" = "" ]; then
  case `uname` in
      (SunOS)
          SPARK_MASTER_HOST="`/usr/sbin/check-hostname | awk '{print $NF}'`"
          ;;
      (*)
          SPARK_MASTER_HOST="`hostname -f`"
          ;;
  esac
fi

# Launch the workers
"${SPARK_HOME}/sbin/workers.sh" cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin/start-worker.sh" "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT"

It determines the master host and port, then calls workers.sh with the arguments cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin/start-worker.sh" "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT".
With concrete values: cd /opt/dkl/spark-3.2.3-bin-hadoop3.2 ; /opt/dkl/spark-3.2.3-bin-hadoop3.2/sbin/start-worker.sh spark://indata-10-110-8-199.indata.com:7077

workers.sh

# Run a shell command on all worker hosts.
#
# Environment Variables
#
#   SPARK_WORKERS    File naming remote hosts.
#     Default is ${SPARK_CONF_DIR}/workers.
#   SPARK_CONF_DIR  Alternate conf dir. Default is ${SPARK_HOME}/conf.
#   SPARK_WORKER_SLEEP Seconds to sleep between spawning remote commands.
#   SPARK_SSH_OPTS Options passed to ssh when running remote commands.
##

usage="Usage: workers.sh [--config <conf-dir>] command..."

# if no args specified, show usage
if [ $# -le 0 ]; then
  echo $usage
  exit 1
fi

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

. "${SPARK_HOME}/sbin/spark-config.sh"

# If the workers file is specified in the command line,
# then it takes precedence over the definition in
# spark-env.sh. Save it here.
if [ -f "$SPARK_WORKERS" ]; then
  HOSTLIST=`cat "$SPARK_WORKERS"`
fi
if [ -f "$SPARK_SLAVES" ]; then
  >&2 echo "SPARK_SLAVES is deprecated, use SPARK_WORKERS"
  HOSTLIST=`cat "$SPARK_SLAVES"`
fi


# Check if --config is passed as an argument. It is an optional parameter.
# Exit if the argument is not a directory.
if [ "$1" == "--config" ]
then
  shift
  conf_dir="$1"
  if [ ! -d "$conf_dir" ]
  then
    echo "ERROR : $conf_dir is not a directory"
    echo $usage
    exit 1
  else
    export SPARK_CONF_DIR="$conf_dir"
  fi
  shift
fi

. "${SPARK_HOME}/bin/load-spark-env.sh"

if [ "$HOSTLIST" = "" ]; then
  if [ "$SPARK_SLAVES" = "" ] && [ "$SPARK_WORKERS" = "" ]; then
    if [ -f "${SPARK_CONF_DIR}/workers" ]; then
      HOSTLIST=`cat "${SPARK_CONF_DIR}/workers"`
    elif [ -f "${SPARK_CONF_DIR}/slaves" ]; then
      HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"`
    else
      HOSTLIST=localhost
    fi
  else
    if [ -f "$SPARK_WORKERS" ]; then
      HOSTLIST=`cat "$SPARK_WORKERS"`
    fi
    if [ -f "$SPARK_SLAVES" ]; then
      >&2 echo "SPARK_SLAVES is deprecated, use SPARK_WORKERS"
      HOSTLIST=`cat "$SPARK_SLAVES"`
    fi
  fi
fi



# By default disable strict host key checking
if [ "$SPARK_SSH_OPTS" = "" ]; then
  SPARK_SSH_OPTS="-o StrictHostKeyChecking=no"
fi
# The sed expression strips everything after '#' on each host line and drops empty lines
# Iterate over HOSTLIST, ssh to each host, and run the arguments passed in from start-workers.sh
# Note: $@ holds all arguments passed to the script or function
for host in `echo "$HOSTLIST"|sed  "s/#.*$//;/^$/d"`; do
  if [ -n "${SPARK_SSH_FOREGROUND}" ]; then
    ssh $SPARK_SSH_OPTS "$host" $"${@// /\\ }" \
      2>&1 | sed "s/^/$host: /"
  else
    ssh $SPARK_SSH_OPTS "$host" $"${@// /\\ }" \
      2>&1 | sed "s/^/$host: /" &
  fi
  if [ "$SPARK_WORKER_SLEEP" != "" ]; then
    sleep $SPARK_WORKER_SLEEP
  fi
  if [ "$SPARK_SLAVE_SLEEP" != "" ]; then
    >&2 echo "SPARK_SLAVE_SLEEP is deprecated, use SPARK_WORKER_SLEEP"
    sleep $SPARK_SLAVE_SLEEP
  fi
done

wait

Main logic:

  1. Build HOSTLIST first, with the precedence $SPARK_WORKERS, $SPARK_SLAVES, ${SPARK_CONF_DIR}/workers, ${SPARK_CONF_DIR}/slaves. Normally the worker IPs or hostnames are configured in conf/workers (the Spark 3 default) or conf/slaves (the Spark 2 default); if nothing is configured, it defaults to localhost.
  2. Determine SPARK_SSH_OPTS, which defaults to "-o StrictHostKeyChecking=no". For special cases, e.g. an SSH port other than the default 22, you can add export SPARK_SSH_OPTS="-p 6233 -o StrictHostKeyChecking=no" to spark-env.sh.
  3. Iterate over HOSTLIST and ssh to each host to execute the arguments passed from start-workers.sh above: cd "${SPARK_HOME}" \; "${SPARK_HOME}/sbin/start-worker.sh" "spark://$SPARK_MASTER_HOST:$SPARK_MASTER_PORT". Note: $@ holds all arguments passed to the script or function. A small standalone sketch of the host-list filtering follows this list.
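
A standalone sketch of that host-list filtering (hypothetical host names, not Spark code); the sed expression removes '#' comments and blank lines before the ssh loop runs:

HOSTLIST='node1
# node2 is commented out and therefore skipped
node3   # inline comments are stripped too
'
for host in `echo "$HOSTLIST" | sed "s/#.*$//;/^$/d"`; do
  echo "would ssh to: $host"
done
# Output:
#   would ssh to: node1
#   would ssh to: node3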

start-worker.sh

# Starts a worker on the machine this script is executed on.
#
# Environment Variables
#
#   SPARK_WORKER_INSTANCES  The number of worker instances to run on this
#                           worker.  Default is 1. Note it has been deprecate since Spark 3.0.
#   SPARK_WORKER_PORT       The base port number for the first worker. If set,
#                           subsequent workers will increment this number.  If
#                           unset, Spark will find a valid port number, but
#                           with no guarantee of a predictable pattern.
#   SPARK_WORKER_WEBUI_PORT The base port for the web interface of the first
#                           worker.  Subsequent workers will increment this
#                           number.  Default is 8081.

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

# NOTE: This exact class name is matched downstream by SparkSubmit.
# Any changes need to be reflected there.
CLASS="org.apache.spark.deploy.worker.Worker"

if [[ $# -lt 1 ]] || [[ "$@" = *--help ]] || [[ "$@" = *-h ]]; then
  echo "Usage: ./sbin/start-worker.sh <master> [options]"
  pattern="Usage:"
  pattern+="\|Using Spark's default log4j profile:"
  pattern+="\|Started daemon with process name"
  pattern+="\|Registered signal handler for"

  "${SPARK_HOME}"/bin/spark-class $CLASS --help 2>&1 | grep -v "$pattern" 1>&2
  exit 1
fi

. "${SPARK_HOME}/sbin/spark-config.sh"

. "${SPARK_HOME}/bin/load-spark-env.sh"

# First argument should be the master; we need to store it aside because we may
# need to insert arguments between it and the other arguments
MASTER=$1
shift

# Determine desired worker port
if [ "$SPARK_WORKER_WEBUI_PORT" = "" ]; then
  SPARK_WORKER_WEBUI_PORT=8081
fi

# Start up the appropriate number of workers on this machine.
# quick local function to start a worker
function start_instance {
  WORKER_NUM=$1
  shift

  if [ "$SPARK_WORKER_PORT" = "" ]; then
    PORT_FLAG=
    PORT_NUM=
  else
    PORT_FLAG="--port"
    PORT_NUM=$(( $SPARK_WORKER_PORT + $WORKER_NUM - 1 ))
  fi
  WEBUI_PORT=$(( $SPARK_WORKER_WEBUI_PORT + $WORKER_NUM - 1 ))

  "${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS $WORKER_NUM \
     --webui-port "$WEBUI_PORT" $PORT_FLAG $PORT_NUM $MASTER "$@"
}

if [ "$SPARK_WORKER_INSTANCES" = "" ]; then
  start_instance 1 "$@"
else
  for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do
    start_instance $(( 1 + $i )) "$@"
  done
fi

Main logic:

  1. Get MASTER, here spark://indata-10-110-8-199.indata.com:7077
  2. Check whether SPARK_WORKER_INSTANCES is set; it is empty by default, i.e. a single Worker instance by default
  3. Call start_instance 1 "$@", whose main job is to compute PORT_NUM and WEBUI_PORT for each instance (see the sketch after the concrete command below) and finally run "${SPARK_HOME}/sbin"/spark-daemon.sh start $CLASS $WORKER_NUM --webui-port "$WEBUI_PORT" $PORT_FLAG $PORT_NUM $MASTER "$@"

With concrete arguments: /opt/dkl/spark-3.2.3-bin-hadoop3.2/sbin/spark-daemon.sh start org.apache.spark.deploy.worker.Worker 1 --webui-port 8081 spark://indata-10-110-8-199.indata.com:7077
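
A standalone sketch of the per-instance port arithmetic in start_instance (assumed values; note that SPARK_WORKER_INSTANCES has been deprecated since Spark 3.0, as the script header says):

SPARK_WORKER_PORT=45000
SPARK_WORKER_WEBUI_PORT=8081
for WORKER_NUM in 1 2 3; do
  PORT_NUM=$(( SPARK_WORKER_PORT + WORKER_NUM - 1 ))
  WEBUI_PORT=$(( SPARK_WORKER_WEBUI_PORT + WORKER_NUM - 1 ))
  echo "instance $WORKER_NUM: --port $PORT_NUM --webui-port $WEBUI_PORT"
done
# instance 1: --port 45000 --webui-port 8081
# instance 2: --port 45001 --webui-port 8082
# instance 3: --port 45002 --webui-port 8083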

spark-daemon.sh

From the analysis above, both the Master and the Worker are started via spark-daemon.sh.

# Runs a Spark command as a daemon.
#
# Environment Variables
#
#   SPARK_CONF_DIR  Alternate conf dir. Default is ${SPARK_HOME}/conf.
#   SPARK_LOG_DIR   Where log files are stored. ${SPARK_HOME}/logs by default.
#   SPARK_LOG_MAX_FILES Max log files of Spark daemons can rotate to. Default is 5.
#   SPARK_MASTER    host:path where spark code should be rsync'd from
#   SPARK_PID_DIR   The pid files are stored. /tmp by default.
#   SPARK_IDENT_STRING   A string representing this instance of spark. $USER by default
#   SPARK_NICENESS The scheduling priority for daemons. Defaults to 0.
#   SPARK_NO_DAEMONIZE   If set, will run the proposed command in the foreground. It will not output a PID file.
##

usage="Usage: spark-daemon.sh [--config <conf-dir>] (start|stop|submit|status) <spark-command> <spark-instance-number> <args...>"

# if no args specified, show usage
if [ $# -le 1 ]; then
  echo $usage
  exit 1
fi

if [ -z "${SPARK_HOME}" ]; then
  export SPARK_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi

. "${SPARK_HOME}/sbin/spark-config.sh"

# get arguments

# Check if --config is passed as an argument. It is an optional parameter.
# Exit if the argument is not a directory.
# Check for a --config argument; if present, set SPARK_CONF_DIR to its value (default: ${SPARK_HOME}/conf)
if [ "$1" == "--config" ]
then
  shift
  conf_dir="$1"
  if [ ! -d "$conf_dir" ]
  then
    echo "ERROR : $conf_dir is not a directory"
    echo $usage
    exit 1
  else
    export SPARK_CONF_DIR="$conf_dir"
  fi
  shift
fi
# Get option, one of start|stop|submit|status; for start-master and start-worker it is start
option=$1
shift
# Get command, i.e. the concrete class: org.apache.spark.deploy.master.Master for start-master
# and org.apache.spark.deploy.worker.Worker for start-worker
command=$1
shift
# Get instance, which is 1 in both cases; it is the instance number passed in earlier and is used later in the log and pid file names
instance=$1
shift

spark_rotate_log ()
{
    log=$1;

    if [[ -z ${SPARK_LOG_MAX_FILES} ]]; then
      num=5
    elif [[ ${SPARK_LOG_MAX_FILES} -gt 0 ]]; then
      num=${SPARK_LOG_MAX_FILES}
    else
      echo "Error: SPARK_LOG_MAX_FILES must be a positive number, but got ${SPARK_LOG_MAX_FILES}"
      exit -1
    fi

    if [ -f "$log" ]; then # rotate logs
	while [ $num -gt 1 ]; do
	    prev=`expr $num - 1`
	    [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
	    num=$prev
	done
	mv "$log" "$log.$num";
    fi
}

. "${SPARK_HOME}/bin/load-spark-env.sh"

if [ "$SPARK_IDENT_STRING" = "" ]; then
  export SPARK_IDENT_STRING="$USER"
fi


export SPARK_PRINT_LAUNCH_COMMAND="1"

# get log directory
if [ "$SPARK_LOG_DIR" = "" ]; then
  export SPARK_LOG_DIR="${SPARK_HOME}/logs"
fi
mkdir -p "$SPARK_LOG_DIR"
touch "$SPARK_LOG_DIR"/.spark_test > /dev/null 2>&1
TEST_LOG_DIR=$?
if [ "${TEST_LOG_DIR}" = "0" ]; then
  rm -f "$SPARK_LOG_DIR"/.spark_test
else
  chown "$SPARK_IDENT_STRING" "$SPARK_LOG_DIR"
fi

if [ "$SPARK_PID_DIR" = "" ]; then
  SPARK_PID_DIR=/tmp
fi

# some variables
# Build the log and pid file paths and names; logs default to ${SPARK_HOME}/logs, pid files to /tmp
log="$SPARK_LOG_DIR/spark-$SPARK_IDENT_STRING-$command-$instance-$HOSTNAME.out"
pid="$SPARK_PID_DIR/spark-$SPARK_IDENT_STRING-$command-$instance.pid"

# Set default scheduling priority
# Set SPARK_NICENESS, the scheduling priority of the daemon; default to 0 if unset
# Note that SPARK_NICENESS is unrelated to the instance number above
if [ "$SPARK_NICENESS" = "" ]; then
    export SPARK_NICENESS=0
fi

execute_command() {
  # -z tests whether ${SPARK_NO_DAEMONIZE+set} is empty, i.e. whether SPARK_NO_DAEMONIZE is unset (it is here)
  if [ -z ${SPARK_NO_DAEMONIZE+set} ]; then
      # Run "$@" and redirect its output to the log file; $@ holds all arguments passed to this function
      # In practice this is nice -n 0 setting the priority, then spark-class launching the Master or Worker class
      nohup -- "$@" >> $log 2>&1 < /dev/null &
      # Store the process id in newpid ($! is the PID of the shell's most recently started background process)
      newpid="$!"
      # Then write newpid to the pid file
      echo "$newpid" > "$pid"

      # Poll for up to 5 seconds for the java process to start
      # Loop from 1 to 10, polling for up to 5 seconds for the java process to start
      for i in {1..10}
      do
         # On each iteration, check whether the java process with pid newpid has started
        if [[ $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
           # If it has started, break out of the loop
           break
        fi
        # Otherwise sleep 0.5s and try again
        sleep 0.5
      done
      # After it starts, sleep another 2 seconds
      sleep 2
      # Check if the process has died; in that case we'll tail the log so the user can see
      # Check whether the java process is still alive; if not, report the launch failure and print the tail of the log
      if [[ ! $(ps -p "$newpid" -o comm=) =~ "java" ]]; then
        echo "failed to launch: $@"
        tail -10 "$log" | sed 's/^/  /'
        echo "full log in $log"
      fi
  else
      "$@"
  fi
}

run_command() {
  # Get mode first; here mode is class
  mode="$1"
  shift
  # Create the pid directory
  mkdir -p "$SPARK_PID_DIR"
  # Check whether the pid file exists
  if [ -f "$pid" ]; then
    # If it exists, read the pid value
    TARGET_ID="$(cat "$pid")"
    # Check whether a corresponding java process is running
    if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
      # If it is, report that the service is already running and must be stopped first
      echo "$command running as process $TARGET_ID.  Stop it first."
      exit 1
    fi
  fi

  if [ "$SPARK_MASTER" != "" ]; then
    echo rsync from "$SPARK_MASTER"
    rsync -a -e ssh --delete --exclude=.svn --exclude='logs/*' --exclude='contrib/hod/logs/*' "$SPARK_MASTER/" "${SPARK_HOME}"
  fi

  spark_rotate_log "$log"
  echo "starting $command, logging to $log"
  # Match on the mode value
  case "$mode" in
    # If it is class
    (class)
      #  nice -n "$SPARK_NICENESS" sets the process priority, usually ranging from -20 (highest) to +19 (lowest); the default nice value is 0.
      #  See https://www.cnblogs.com/yinguojin/p/18600924
      execute_command nice -n "$SPARK_NICENESS" "${SPARK_HOME}"/bin/spark-class "$command" "$@"
      ;;

    (submit)
      execute_command nice -n "$SPARK_NICENESS" bash "${SPARK_HOME}"/bin/spark-submit --class "$command" "$@"
      ;;

    (*)
      echo "unknown mode: $mode"
      exit 1
      ;;
  esac

}
# Match on option
case $option in
  # For submit, run run_command submit "$@"
  (submit)
    run_command submit "$@"
    ;;
  # For start, run run_command class "$@"
  (start)
    run_command class "$@"
    ;;

  (stop)

    if [ -f $pid ]; then
      TARGET_ID="$(cat "$pid")"
      if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
        echo "stopping $command"
        kill "$TARGET_ID" && rm -f "$pid"
      else
        echo "no $command to stop"
      fi
    else
      echo "no $command to stop"
    fi
    ;;

  (decommission)

    if [ -f $pid ]; then
      TARGET_ID="$(cat "$pid")"
      if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
        echo "decommissioning $command"
        kill -s SIGPWR "$TARGET_ID"
      else
        echo "no $command to decommission"
      fi
    else
      echo "no $command to decommission"
    fi
    ;;

  (status)

    if [ -f $pid ]; then
      TARGET_ID="$(cat "$pid")"
      if [[ $(ps -p "$TARGET_ID" -o comm=) =~ "java" ]]; then
        echo $command is running.
        exit 0
      else
        echo $pid file is present but $command not running
        exit 1
      fi
    else
      echo $command not running.
      exit 2
    fi
    ;;

  (*)
    echo $usage
    exit 1
    ;;

esac

Main logic:

  • Check for a --config argument; if present, set SPARK_CONF_DIR to its value (default ${SPARK_HOME}/conf)
  • Get option, one of start|stop|submit|status; for start-master and start-worker it is start
  • Get command, i.e. the concrete class: org.apache.spark.deploy.master.Master for start-master, org.apache.spark.deploy.worker.Worker for start-worker
  • Get instance, which is 1 in both cases; it is the instance number and is used later in the log and pid file names
  • Build the log and pid file paths and names; logs default to ${SPARK_HOME}/logs and pid files to /tmp
  • Set SPARK_NICENESS, the scheduling priority of the daemon; if SPARK_NICENESS is empty, default it to 0
  • Match on option: for submit, run run_command submit "$@"; for start, run run_command class "$@"; and so on. Here we only follow start.
  • run_command logic:
    • Get mode first; here mode is class
    • Create the pid directory and check whether the pid file exists; if it does, read the pid and check whether a corresponding java process is running; if so, report that the service is already running and must be stopped first
    • Otherwise, match on mode; for class, run execute_command nice -n "$SPARK_NICENESS" "${SPARK_HOME}"/bin/spark-class "$command" "$@"
    • execute_command is a function defined in this script; nice -n "$SPARK_NICENESS" sets the process priority, which usually ranges from -20 (highest priority) to +19 (lowest priority), with a default nice value of 0. See https://www.cnblogs.com/yinguojin/p/18600924
  • execute_command logic (a standalone sketch of the daemonize test follows this list):
    • -z ${SPARK_NO_DAEMONIZE+set} tests whether SPARK_NO_DAEMONIZE is unset, which it is here
    • Run nohup -- "$@" >> $log 2>&1 < /dev/null &, i.e. execute "$@" and append its output to the log file; $@ holds all the arguments of the function, which here are nice -n 0 /opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.master.Master --host indata-10-110-8-199.indata.com --port 7077 --webui-port 8080 and nice -n 0 /opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://indata-10-110-8-199.indata.com:7077
      In effect, nice -n 0 sets the process priority and spark-class launches the corresponding Master or Worker class
    • Assign the returned process id to newpid and write newpid to the pid file ($! is the PID of the shell's most recently started background process)
    • Loop from 1 to 10, each time checking whether the java process with pid newpid has started; break on success, otherwise sleep 0.5s and continue, i.e. poll for at most 5 seconds for the java process to start
    • Once it has started, sleep 2 seconds, then check whether the java process is still alive; if not, report the launch failure and print the tail of the corresponding log
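
A standalone sketch (not Spark code) of the ${VAR+set} test that chooses between daemonizing and running in the foreground:

unset SPARK_NO_DAEMONIZE
[ -z "${SPARK_NO_DAEMONIZE+set}" ] && echo "unset -> daemonize: nohup -- ... >> \$log 2>&1 < /dev/null &"
SPARK_NO_DAEMONIZE=1
[ -z "${SPARK_NO_DAEMONIZE+set}" ] || echo "set (to any value, even an empty one) -> run the command in the foreground"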

spark-class

From the analysis above, spark-sql, spark-shell, the Master, and the Worker are all ultimately launched via spark-class; the concrete invocations are:

/opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
/opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name Spark shell
nice -n 0 /opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.master.Master --host indata-10-110-8-199.indata.com --port 7077 --webui-port 8080
nice -n 0 /opt/dkl/spark-3.2.3-bin-hadoop3.2/bin/spark-class org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://indata-10-110-8-199.indata.com:7077

if [ -z "${SPARK_HOME}" ]; then
  source "$(dirname "$0")"/find-spark-home
fi

. "${SPARK_HOME}"/bin/load-spark-env.sh

# Find the java binary
# Find java: if JAVA_HOME is set, use ${JAVA_HOME}/bin/java
if [ -n "${JAVA_HOME}" ]; then
  RUNNER="${JAVA_HOME}/bin/java"
else
  if [ "$(command -v java)" ]; then
    RUNNER="java"
  else
    echo "JAVA_HOME is not set" >&2
    exit 1
  fi
fi

# Find Spark jars.
# Find the Spark jars
if [ -d "${SPARK_HOME}/jars" ]; then
  SPARK_JARS_DIR="${SPARK_HOME}/jars"
else
  SPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION/jars"
fi

if [ ! -d "$SPARK_JARS_DIR" ] && [ -z "$SPARK_TESTING$SPARK_SQL_TESTING" ]; then
  echo "Failed to find Spark jars directory ($SPARK_JARS_DIR)." 1>&2
  echo "You need to build Spark with the target \"package\" before running this program." 1>&2
  exit 1
else
  LAUNCH_CLASSPATH="$SPARK_JARS_DIR/*"
fi

# Add the launcher build dir to the classpath if requested.
if [ -n "$SPARK_PREPEND_CLASSES" ]; then
  LAUNCH_CLASSPATH="${SPARK_HOME}/launcher/target/scala-$SPARK_SCALA_VERSION/classes:$LAUNCH_CLASSPATH"
fi

# For tests
if [[ -n "$SPARK_TESTING" ]]; then
  unset YARN_CONF_DIR
  unset HADOOP_CONF_DIR
fi

# The launcher library will print arguments separated by a NULL character, to allow arguments with
# characters that would be otherwise interpreted by the shell. Read that in a while loop, populating
# an array that will be used to exec the final command.
#
# The exit code of the launcher is appended to the output, so the parent shell removes it from the
# command array and checks the value to see if the launcher succeeded.
build_command() {
  # Run org.apache.spark.launcher.Main "$@" via java -cp; that class prints the command to execute
  # Output format: first a '\0' followed by a newline, then the assembled command with each token followed by '\0' and no newline
  "$RUNNER" -Xmx128m $SPARK_LAUNCHER_OPTS -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@"
  # Print the exit status of the command above, followed by a '\0' and no newline
  printf "%d\0" $?
}

# Turn off posix mode since it does not allow process substitution
set +o posix
# Define the CMD array
CMD=()
# -d : the input delimiter used by read
DELIM=$'\n'
# Flag marking the start of the command; false initially
CMD_START_FLAG="false"
# -r : backslashes are not treated as escapes, so a trailing '\' stays a literal character (e.g. \n remains two characters instead of a newline)
# Run build_command "$@" and parse its output
# For the bash read command, see https://blog.csdn.net/lingeio/article/details/96587362
# Consume the first line first, then set CMD_START_FLAG to true and start collecting the command, splitting the tokens on '\0' and appending them to the CMD array
while IFS= read -d "$DELIM" -r ARG; do
  # Once CMD_START_FLAG is true, collect the command tokens
  if [ "$CMD_START_FLAG" == "true" ]; then
    CMD+=("$ARG")
  else
    # Initially the delimiter is $'\n'; the first ARG read is the NUL character
    if [ "$ARG" == $'\0' ]; then
      # After NULL character is consumed, change the delimiter and consume command string.
      # Switch DELIM to '' (NUL) and set CMD_START_FLAG to true, i.e. start collecting the command
      DELIM=''
      CMD_START_FLAG="true"
    elif [ "$ARG" != "" ]; then
      echo "$ARG"
    fi
  fi
done < <(build_command "$@")
# Length of the CMD array
COUNT=${#CMD[@]}
LAST=$((COUNT - 1))
# The last element of CMD is the launcher's exit code, LAUNCHER_EXIT_CODE
LAUNCHER_EXIT_CODE=${CMD[$LAST]}

# Certain JVM failures result in errors being printed to stdout (instead of stderr), which causes
# the code that parses the output of the launcher to get confused. In those cases, check if the
# exit code is an integer, and if it's not, handle it as a special error case.
# Check that LAUNCHER_EXIT_CODE is a valid integer; if not, handle the error and exit
if ! [[ $LAUNCHER_EXIT_CODE =~ ^[0-9]+$ ]]; then
  echo "${CMD[@]}" | head -n-1 1>&2
  exit 1
fi

if [ $LAUNCHER_EXIT_CODE != 0 ]; then
  exit $LAUNCHER_EXIT_CODE
fi
# Take LAST elements starting from index 0, i.e. everything except the exit code
# For bash array usage, see https://blog.csdn.net/trayvontang/article/details/143440654
CMD=("${CMD[@]:0:$LAST}")
# If the exit code is normal, exec the command produced by org.apache.spark.launcher.Main
exec "${CMD[@]}"

Main logic:

  • Find java: if JAVA_HOME is set, use ${JAVA_HOME}/bin/java
  • Run build_command "$@" and print its output. The underlying command is "$RUNNER" -Xmx128m $SPARK_LAUNCHER_OPTS -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@", which concretely is:
    • spark-sql : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -Xmx128m -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
    • spark-shell : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -Xmx128m -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name Spark shell
    • start-master : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -Xmx128m -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* org.apache.spark.launcher.Main org.apache.spark.deploy.master.Master --host indata-10-110-8-199.indata.com --port 7077 --webui-port 8080
    • start-worker : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -Xmx128m -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* org.apache.spark.launcher.Main org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://indata-10-110-8-199.indata.com:7077
    • The main logic of org.apache.spark.launcher.Main is to assemble the launch command from the arguments passed in and print it:
    System.out.println('\0');
    List<String> bashCmd = prepareBashCommand(cmd, env);
    for (String c : bashCmd) {
      System.out.print(c);
      System.out.print('\0');
    }
    
    The printing code is shown above: it first prints '\0' followed by a newline, then the assembled command, with every token followed by '\0' and no trailing newline. The commands printed are:
    • spark-sql :

      /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java-cp/opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/*-Xmx1gorg.apache.spark.deploy.SparkSubmit--classorg.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriverspark-internal
      Two lines in total: the first line is just the '\0' terminator, and the second line is the assembled command with every token followed by '\0'; because '\0' is invisible, the tokens appear to run together without spaces, and the second line has no trailing newline. The first line is the same for all of these commands, so only the second line is shown from here on.

    • spark-shell :
      /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java-cp/opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/*-Dscala.usejavacp=true-Xmx1gorg.apache.spark.deploy.SparkSubmit--classorg.apache.spark.repl.Main--nameSpark shellspark-shell

    • start-master :
      /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java-cp/opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/*-Xmx1gorg.apache.spark.deploy.master.Master--hostindata-10-110-8-199.indata.com--port7077--webui-port8080

    • start-worker :
      /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java-cp/opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/*-Xmx1gorg.apache.spark.deploy.worker.Worker--webui-port8081spark://indata-10-110-8-199.indata.com:7077

    • Finally, build_command runs printf "%d\0" $?; this line prints $? followed by a \0, where $? is a special variable holding the exit status of the previous command:

      • 0: the command succeeded
      • any non-zero value: the command failed
      • 1: general error, with no more specific information
      • 2: misuse of shell built-ins
      • 126: command invoked cannot execute
      • 127: command not found
        So the final output of build_command, taking spark-sql as an example, is:
        /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java-cp/opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/*-Xmx1gorg.apache.spark.deploy.SparkSubmit--classorg.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriverspark-internal0
        When \0 is written to a log file and the file is opened with vi, it is displayed as ^@; this is what the output looks like in vi:
        ^@
        /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java^@-cp^@/opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/*
        ^@-Xmx1g^@org.apache.spark.deploy.SparkSubmit^@--class^@org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver^@spark-internal^@0^@
        This makes it much easier to see what build_command actually outputs, which helps with the rest of the script analysis. Replacing each ^@ with a space gives:
        /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver spark-internal 0
    • One last note on printf "%d\0" $?: %d formats a number, so this line prints the exit status of the previous command followed by a \0, with no newline. Testing also shows that %d prints 0 when no number is supplied; you can run printf "\0" > test.log and open test.log with vi to see \0 displayed as ^@, whereas cat test.log shows an empty string.

  • The following while IFS= read -d "$DELIM" -r ARG loop uses read to consume the output of build_command "$@" (for the bash read command, see https://blog.csdn.net/lingeio/article/details/96587362). Its job is to parse the output produced in the previous step and put the command into the CMD array: it first consumes the first line, then sets CMD_START_FLAG to true and starts collecting the command, splitting the remaining output on '\0' and appending each token to CMD. A standalone sketch of this parsing follows this list.
  • Once the CMD array is assembled, it takes the array length and uses the last element as LAUNCHER_EXIT_CODE, i.e. the exit status of the "$RUNNER" -Xmx128m $SPARK_LAUNCHER_OPTS -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main "$@" command executed inside build_command.
  • It then checks whether LAUNCHER_EXIT_CODE is a valid, successful exit code; if not, it handles the error and exits; if it is, it executes the command returned by org.apache.spark.launcher.Main.
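
A standalone sketch of that NUL-delimited parsing (illustrative only; build_command_demo below is a stand-in for the real launcher output). Note that bash cannot store NUL bytes in variables, so $'\0' is effectively an empty string and the first read yields an empty ARG, which is exactly what the comparison matches:

build_command_demo() {
  printf '\0\n'                        # first line: a NUL followed by a newline
  printf '%s\0' java -Xmx1g MyClass    # each command token followed by a NUL
  printf '%d\0' 0                      # the exit status, also followed by a NUL
}

CMD=()
DELIM=$'\n'
CMD_START_FLAG="false"
while IFS= read -d "$DELIM" -r ARG; do
  if [ "$CMD_START_FLAG" == "true" ]; then
    CMD+=("$ARG")
  elif [ "$ARG" == $'\0' ]; then
    DELIM=''                           # from here on, split on NUL
    CMD_START_FLAG="true"
  fi
done < <(build_command_demo)

COUNT=${#CMD[@]}
LAST=$((COUNT - 1))
echo "exit code: ${CMD[$LAST]}"        # exit code: 0
echo "command:   ${CMD[@]:0:$LAST}"    # command:   java -Xmx1g MyClass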

Summary

From the simple analysis above, whether it is an interactive command line such as spark-sql or spark-shell, or the startup of Standalone services such as the Master and Worker, everything is ultimately launched through spark-class. spark-class in turn first runs the class org.apache.spark.launcher.Main via java -cp, which prints the assembled launch command; spark-class then parses that output and executes it, so in the end everything runs a concrete class via java -cp, as follows:

  • spark-sql : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver spark-internal
  • spark-shell : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* -Dscala.usejavacp=true -Xmx1g org.apache.spark.deploy.SparkSubmit --class org.apache.spark.repl.Main --name Spark shell spark-shell
  • start-master : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host indata-10-110-8-199.indata.com --port 7077 --webui-port 8080
  • start-worker : /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://indata-10-110-8-199.indata.com:7077
    Submitting our own application jar works the same way. For example, spark-submit --master local --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.12-3.2.3.jar corresponds to the following spark-class command (a tip for printing this final command yourself follows this list):
    /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.181-7.b13.el7.x86_64/bin/java -cp /opt/dkl/spark-3.2.3-bin-hadoop3.2/conf/:/opt/dkl/spark-3.2.3-bin-hadoop3.2/jars/* -Xmx1g org.apache.spark.deploy.SparkSubmit --master local --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.12-3.2.3.jar
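
To see this final java command for any invocation without digging through logs, one option (assuming the SPARK_PRINT_LAUNCH_COMMAND switch honored by org.apache.spark.launcher.Main, which spark-daemon.sh also exports for the daemons) is:

SPARK_PRINT_LAUNCH_COMMAND=1 bin/spark-shell
# prints a line beginning with "Spark Command:" containing the full java -cp ... command before launching it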

Summarizing further: service daemons are launched by running their concrete class directly via java -cp, whereas interactive command lines and application jars first run org.apache.spark.deploy.SparkSubmit via java -cp, with the class that ultimately executes passed in via --class. Next time we will analyze org.apache.spark.deploy.SparkSubmit to see how that final class is actually submitted and run.
