why's blog

Graceful Pod termination in Kubernetes

Date: Aug. 19, 2022 Category: Containers

Graceful shutdown in Kubernetes

When a Pod is destroyed, it may still have tasks in progress. How can we make sure those tasks finish normally before the Pod is actually destroyed?

Pod termination flow

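The Pod is deleted and its status is set to Terminating.
The Pod is removed from the endpoint list of its Service, and kube-proxy picks up the change and updates its forwarding rules, so new traffic is no longer routed to the Pod. This happens in parallel with the steps below.
If the Pod has a preStop hook configured, it is executed.
Once the preStop hook has finished, kubelet sends SIGTERM to each container in the Pod to tell the container process to begin a graceful shutdown.
kubelet waits for the container processes to stop completely. If they have not stopped within terminationGracePeriodSeconds (30s by default, counted from the start of termination, so it includes the time spent in preStop), it sends SIGKILL to force-kill them.
Once all container processes have terminated, the Pod's resources are cleaned up.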

Graceful termination

There are three main things we can control:

preStop
How the program handles SIGTERM
terminationGracePeriodSeconds

preStop

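In the first case, a Pod that has been set to Terminating can still receive new connections. kube-proxy and kubelet both watch the Pod's state at the same time, so if kubelet stops the container before kube-proxy has updated its rules, traffic still arrives at a Pod that can no longer serve it.

So we need to reserve some time before kubelet stops the container, for example with sleep 5s: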

        lifecycle:
          preStop:
            exec:
              command:
              - sleep
              - 5s

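In the second case, the Pod is terminated by kubelet while it is still handling a request, so no response is returned. Options:

Handle SIGTERM in the application
Use a script to check the state of the service
Set the sleep time to the maximum request timeout + 5s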

        lifecycle:
          preStop:
            exec:
              command:
              - /clean.sh
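
The original does not show what /clean.sh contains. As a rough sketch of the "check the state of the service" idea, the hypothetical helper below polls until no requests are in flight or a timeout is reached. The name wait_for_drain and the active_requests callable are assumptions for illustration; how you count in-flight requests (a metrics endpoint, open sockets, and so on) depends on your service.

```python
import time
from typing import Callable


def wait_for_drain(
    active_requests: Callable[[], int],
    timeout_s: float,
    interval_s: float = 1.0,
) -> bool:
    """Poll until no requests are in flight or the timeout elapses.

    active_requests is a caller-supplied (hypothetical) function that
    returns the current number of in-flight requests.
    Returns True if the service drained in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if active_requests() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Whatever timeout is chosen here still has to fit inside terminationGracePeriodSeconds, because the time spent in preStop counts against it.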

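Handling SIGTERM

If the container is started through a script, the application runs as a child process of the shell, and SIGTERM may not be passed on to it.

Starting with exec

exec replaces the shell with the program being launched, so the application becomes the container's main process and receives signals directly.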

#! /bin/bash
...

exec yourapp
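
The same idea applies when the entrypoint is a Python wrapper rather than a shell script. A minimal sketch, assuming a hypothetical helper named exec_app: os.execvp replaces the current process image, so the application keeps the wrapper's PID and receives SIGTERM itself.

```python
import os
import sys
from typing import Sequence


def exec_app(argv: Sequence[str]) -> None:
    """Replace the current process with the application (does not return)."""
    # Flush buffered output first: exec discards the interpreter's
    # in-memory buffers along with the rest of the process image.
    sys.stdout.flush()
    sys.stderr.flush()
    # argv[0] is looked up on PATH; the PID stays the same after exec.
    os.execvp(argv[0], list(argv))
```

A wrapper would do its setup and then end with something like exec_app(["yourapp"]), mirroring the exec yourapp line in the shell version.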

Trapping signals

This approach is generally used when one container runs several processes, which is not recommended on Kubernetes.

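#! /bin/bash

/bin/app1 & pid1="$!" # start the first application process and record its pid
echo "app1 started with pid $pid1"

/bin/app2 & pid2="$!" # start the second application process and record its pid
echo "app2 started with pid $pid2"

handle_sigterm() {
  echo "[INFO] Received SIGTERM"
  kill -SIGTERM "$pid1" "$pid2" # forward SIGTERM to the application processes
  wait "$pid1" "$pid2" # wait until all application processes have terminated
}
trap handle_sigterm SIGTERM # trap SIGTERM and call handle_sigterm

wait # block until the children exit or the trap handler has run, then exit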
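
For comparison, here is roughly the same forwarding pattern as a Python entrypoint. This is only a sketch under assumptions: supervise is a hypothetical helper, and it has to be called from the main thread because that is where Python allows signal handlers to be installed.

```python
import signal
import subprocess
from typing import List, Sequence


def supervise(commands: Sequence[Sequence[str]]) -> List[int]:
    """Start each command, forward SIGTERM to all of them, return exit codes."""
    children = [subprocess.Popen(list(cmd)) for cmd in commands]

    def forward_sigterm(signum, frame):
        for child in children:
            if child.poll() is None:  # still running
                child.send_signal(signal.SIGTERM)

    previous = signal.signal(signal.SIGTERM, forward_sigterm)
    try:
        # wait() resumes automatically after the handler has run
        return [child.wait() for child in children]
    finally:
        # Restore whatever handler was installed before
        signal.signal(signal.SIGTERM, previous)
```

A child that is terminated by the forwarded signal reports a negative return code (the signal number negated), which is how subprocess represents death by signal.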

dumb-init

See the earlier post on dumb-init.

terminationGracePeriodSeconds

Set it in spec.template.spec.terminationGracePeriodSeconds.

Receiving SIGTERM in the program
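
The grace period has to cover both the preStop hook and the time the application needs after SIGTERM, since the countdown starts when termination begins. A small sketch of that arithmetic follows; the function names and the 5s safety margin are assumptions for illustration, not Kubernetes APIs.

```python
import math


def fits_in_grace_period(
    pre_stop_s: float,
    shutdown_s: float,
    grace_period_s: float = 30.0,
) -> bool:
    """True if preStop plus SIGTERM handling finishes before SIGKILL."""
    return pre_stop_s + shutdown_s <= grace_period_s


def suggested_grace_period(
    pre_stop_s: float,
    shutdown_s: float,
    margin_s: float = 5.0,
) -> int:
    """Whole seconds to put in terminationGracePeriodSeconds (margin assumed)."""
    return math.ceil(pre_stop_s + shutdown_s + margin_s)
```

For example, a 5s preStop sleep with up to 40s of request draining does not fit in the default 30s, so the value would need to be raised.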

Python

import signal
import sys

def shutdown(signum, frame):
    print('Caught SIGTERM, shutting down')
    # Finish any outstanding requests, then...
    sys.exit(0)

if __name__ == '__main__':
    # Register handler
    signal.signal(signal.SIGTERM, shutdown)
    # Main logic goes here
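
The handler above leaves "finish any outstanding requests" as a comment. One common way to fill that in, sketched here under assumptions, is to have the handler only set a flag and let the main loop exit between units of work. serve and handle_one are hypothetical names; handle_one stands for whatever processes a single request or job.

```python
import signal
import time

shutdown_requested = False


def request_shutdown(signum, frame):
    """Signal handler: only set a flag; the main loop does the actual work."""
    global shutdown_requested
    shutdown_requested = True


def serve(handle_one, idle_s: float = 0.1) -> int:
    """Call handle_one() repeatedly until SIGTERM arrives; return the count."""
    signal.signal(signal.SIGTERM, request_shutdown)
    handled = 0
    while not shutdown_requested:
        handle_one()  # the handler never aborts a unit of work midway
        handled += 1
        time.sleep(idle_s)
    return handled
```

Because the flag is only checked between iterations, a unit of work that is already running is completed before the process exits.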

Go

package main

import (
    "fmt"
    "os"
    "os/signal"
    "syscall"
)

func main() {

    sigs := make(chan os.Signal, 1)
    done := make(chan bool, 1)
    // Register the channel to receive SIGTERM
    signal.Notify(sigs, syscall.SIGTERM)

    go func() {
        <-sigs // block until SIGTERM arrives
        fmt.Println("Caught SIGTERM, shutting down")
        // Finish any outstanding requests, then...
        done <- true
    }()

    fmt.Println("Starting application")
    // Main logic goes here
    <-done
    fmt.Println("exiting")
}

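With Istio

If the Pod has a sidecar container such as Istio's, the Istio proxy container terminates after 5s by default when the Pod is terminated.

So it needs to wait for the application container: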

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      annotations:
        proxy.istio.io/config: |
          terminationDrainDuration: 60s # custom graceful termination duration for Envoy
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 60 # if terminationDrainDuration exceeds 30s, set terminationGracePeriodSeconds explicitly
      containers:
      - name: nginx
        image: "nginx"

Alternatively, the sidecar can wait until no other network connections remain. This has to be configured in the global ConfigMap:

$ kubectl -n istio-system edit configmap istio-sidecar-injector
# Add the following under global.proxy in the values
          "lifecycle": {
            "preStop": {
              "exec": {
                "command": ["/bin/sh", "-c", "while [ $(netstat -plunt | grep tcp | grep -v envoy | wc -l | xargs) -ne 0 ]; do sleep 1; done"]
              }
            }
          },

Yan, big data engineer at 火眼征信