How to Configure the Kubernetes PodGC Controller


This article covers how the Kubernetes PodGC Controller is configured and, by walking through the source code, how it is started and how it collects pods.

PodGC Controller configuration

Only two kube-controller-manager flags relate to the PodGC Controller:

| flag | default value | comments |
| --- | --- | --- |
| --controllers stringSlice | * | The list of controllers to enable. podgc can be enabled or disabled here; by default it is in the enabled list. |
| --terminated-pod-gc-threshold int32 | 12500 | Number of terminated pods that can exist before the terminated pod garbage collector starts deleting terminated pods. If <= 0, the terminated pod garbage collector is disabled. (default 12500) |

PodGC Controller entry point

The PodGC Controller is started when kube-controller-manager runs. When CMServer runs, it invokes StartControllers, which iterates over the pre-registered, enabled controllers and starts them one by one.

cmd/kube-controller-manager/app/controllermanager.go:180
func Run(s *options.CMServer) error {
	...
	err := StartControllers(newControllerInitializers(), s, rootClientBuilder, clientBuilder, stop)
	...
}

newControllerInitializers registers the regular controllers together with their start functions. I call them regular because some controllers are not registered here, for example the very important service controller and node controller; I call those non-regular controllers.

func newControllerInitializers() map[string]InitFunc {
	controllers := map[string]InitFunc{}
	controllers["endpoint"] = startEndpointController
	// ... other controllers omitted
	controllers["podgc"] = startPodGCController
	return controllers
}

So CMServer ultimately invokes startPodGCController to start the PodGC Controller.

cmd/kube-controller-manager/app/core.go:66
func startPodGCController(ctx ControllerContext) (bool, error) {
	go podgc.NewPodGC(
		ctx.ClientBuilder.ClientOrDie("pod-garbage-collector"),
		ctx.InformerFactory.Core().V1().Pods(),
		int(ctx.Options.TerminatedPodGCThreshold),
	).Run(ctx.Stop)
	return true, nil
}

startPodGCController is simple: it creates the PodGC controller and runs it in a new goroutine.
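The registration-and-start pattern described in this section can be sketched in a few lines. This is a simplified illustration, not the kube-controller-manager code: InitFunc drops the ControllerContext argument, and isEnabled and startEnabled are hypothetical helpers. The enable check assumes the documented meaning of the --controllers values ("*" enables everything on by default, "name" enables one controller, "-name" disables it) and ignores controllers that are disabled by default.

```go
package main

import (
	"fmt"
	"sort"
)

// InitFunc mirrors the shape of the start functions in the article, minus
// the ControllerContext argument, which is omitted in this sketch.
type InitFunc func() (started bool, err error)

// isEnabled is a simplified, hypothetical reading of a --controllers list:
// "name" enables, "-name" disables, and "*" enables everything else.
func isEnabled(name string, controllers []string) bool {
	hasStar := false
	for _, c := range controllers {
		switch c {
		case name:
			return true
		case "-" + name:
			return false
		case "*":
			hasStar = true
		}
	}
	return hasStar
}

// startEnabled walks the registry, starts every enabled controller and
// returns the names it started, sorted for stable output.
func startEnabled(registry map[string]InitFunc, controllers []string) ([]string, error) {
	started := []string{}
	for name, initFn := range registry {
		if !isEnabled(name, controllers) {
			continue
		}
		ok, err := initFn()
		if err != nil {
			return nil, fmt.Errorf("starting %q: %w", name, err)
		}
		if ok {
			started = append(started, name)
		}
	}
	sort.Strings(started)
	return started, nil
}

func main() {
	registry := map[string]InitFunc{
		"endpoint": func() (bool, error) { return true, nil },
		"podgc":    func() (bool, error) { return true, nil },
	}
	all, _ := startEnabled(registry, []string{"*"})
	fmt.Println(all) // [endpoint podgc]
	withoutPodGC, _ := startEnabled(registry, []string{"*", "-podgc"})
	fmt.Println(withoutPodGC) // [endpoint]
}
```

With "*" alone both controllers start; adding "-podgc" leaves only endpoint, which is how podgc would be switched off through the --controllers flag.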

Creating the PodGC Controller

Let's first look at the definition of PodGCController.

pkg/controller/podgc/gc_controller.go:44
type PodGCController struct {
	kubeClient             clientset.Interface
	podLister              corelisters.PodLister
	podListerSynced        cache.InformerSynced
	deletePod              func(namespace, name string) error
	terminatedPodThreshold int
}

kubeClient: the client used to communicate with the apiserver.

podLister: helps list Pods.

podListerSynced: reports whether the PodLister has synced.

deletePod: the function that calls the apiserver to delete a given pod.

terminatedPodThreshold: the value of --terminated-pod-gc-threshold, 12500 by default.

pkg/controller/podgc/gc_controller.go:54
func NewPodGC(kubeClient clientset.Interface, podInformer coreinformers.PodInformer, terminatedPodThreshold int) *PodGCController {
	if kubeClient != nil && kubeClient.Core().RESTClient().GetRateLimiter() != nil {
		metrics.RegisterMetricAndTrackRateLimiterUsage("gc_controller", kubeClient.Core().RESTClient().GetRateLimiter())
	}
	gcc := &PodGCController{
		kubeClient:             kubeClient,
		terminatedPodThreshold: terminatedPodThreshold,
		deletePod: func(namespace, name string) error {
			glog.Infof("PodGC is force deleting Pod: %v:%v", namespace, name)
			return kubeClient.Core().Pods(namespace).Delete(name, metav1.NewDeleteOptions(0))
		},
	}
	gcc.podLister = podInformer.Lister()
	gcc.podListerSynced = podInformer.Informer().HasSynced
	return gcc
}

Creating the PodGC Controller merely assigns the fields of PodGCController. Note the metav1.NewDeleteOptions(0) argument in the deletePod definition: it means the pod is deleted immediately, with no grace period.
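One consequence of storing deletePod as a plain function field is that the deletion call can be swapped for a stand-in, for example to observe which pods a GC pass would remove without contacting the apiserver. The sketch below illustrates that idea only; gcController and newRecordingController are hypothetical names, not part of the Kubernetes source.

```go
package main

import "fmt"

// gcController is a stripped-down, hypothetical stand-in for PodGCController
// that keeps only the deletePod field.
type gcController struct {
	deletePod func(namespace, name string) error
}

// newRecordingController wires deletePod to a fake that appends
// "namespace/name" to the given slice instead of calling an API server.
func newRecordingController(deleted *[]string) *gcController {
	return &gcController{
		deletePod: func(namespace, name string) error {
			*deleted = append(*deleted, namespace+"/"+name)
			return nil
		},
	}
}

func main() {
	var deleted []string
	gcc := newRecordingController(&deleted)
	_ = gcc.deletePod("default", "job-1")
	fmt.Println(deleted) // [default/job-1]
}
```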

Running the PodGC Controller

Once the PodGC Controller is created, its Run method is called to start it.

pkg/controller/podgc/gc_controller.go:73
func (gcc *PodGCController) Run(stop <-chan struct{}) {
	if !cache.WaitForCacheSync(stop, gcc.podListerSynced) {
		utilruntime.HandleError(fmt.Errorf("timed out waiting for caches to sync"))
		return
	}
	go wait.Until(gcc.gc, gcCheckPeriod, stop)
	<-stop
}

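Every 100ms, WaitForCacheSync checks whether the PodLister has synced, and Run blocks until it has.

Run then starts a goroutine that runs gcc.gc to collect pods, waits 20s after each run completes, and runs gcc.gc again, until the stop signal is received.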
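The "run, wait for the period, run again until stopped" behaviour can be sketched with the standard library alone. The until function below is a hypothetical stand-in written for illustration, not the implementation of wait.Until, and the demo uses a 10ms period instead of the 20s mentioned above so that it finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// until calls f, waits for period, and repeats until stop is closed.
// The period is measured from the end of one call to the start of the next.
func until(f func(), period time.Duration, stop <-chan struct{}) {
	for {
		// Do not start another pass if stop has already been signalled.
		select {
		case <-stop:
			return
		default:
		}

		f()

		// Wait for the next tick, but wake up early on stop.
		select {
		case <-stop:
			return
		case <-time.After(period):
		}
	}
}

func main() {
	stop := make(chan struct{})
	runs := 0
	until(func() {
		runs++ // stands in for one gcc.gc pass
		if runs == 3 {
			close(stop)
		}
	}, 10*time.Millisecond, stop)
	fmt.Println("gc passes:", runs) // gc passes: 3
}
```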

pkg/controller/podgc/gc_controller.go:83
func (gcc *PodGCController) gc() {
	pods, err := gcc.podLister.List(labels.Everything())
	if err != nil {
		glog.Errorf("Error while listing all Pods: %v", err)
		return
	}
	if gcc.terminatedPodThreshold > 0 {
		gcc.gcTerminated(pods)
	}
	gcc.gcOrphaned(pods)
	gcc.gcUnscheduledTerminating(pods)
}

gcc.gc contains the actual pod collection logic:

List all pods from the PodLister (no filter is applied).

If terminatedPodThreshold is greater than 0, call gcc.gcTerminated(pods) to collect the terminated pods that exceed the threshold.

Call gcc.gcOrphaned(pods) to collect orphaned pods.

Call gcc.gcUnscheduledTerminating(pods) to collect pods that are terminating but were never scheduled.

Note:

gcTerminated, gcOrphaned and gcUnscheduledTerminating run serially, one after another.

In gcTerminated, the pods over the threshold are deleted in parallel. A sync.WaitGroup waits for all of those deletions to finish before gcTerminated returns, and only then does gcOrphaned start.

gcOrphaned and gcUnscheduledTerminating each delete pods serially.

Collecting terminated pods

func (gcc *PodGCController) gcTerminated(pods []*v1.Pod) {
	terminatedPods := []*v1.Pod{}
	for _, pod := range pods {
		if isPodTerminated(pod) {
			terminatedPods = append(terminatedPods, pod)
		}
	}
	terminatedPodCount := len(terminatedPods)
	sort.Sort(byCreationTimestamp(terminatedPods))
	deleteCount := terminatedPodCount - gcc.terminatedPodThreshold
	if deleteCount > terminatedPodCount {
		deleteCount = terminatedPodCount
	}
	if deleteCount > 0 {
		glog.Infof("garbage collecting %v pods", deleteCount)
	}
	var wait sync.WaitGroup
	for i := 0; i < deleteCount; i++ {
		wait.Add(1)
		go func(namespace string, name string) {
			defer wait.Done()
			if err := gcc.deletePod(namespace, name); err != nil {
				// ignore not founds
				defer utilruntime.HandleError(err)
			}
		}(terminatedPods[i].Namespace, terminatedPods[i].Name)
	}
	wait.Wait()
}

Iterate over all pods and keep the terminated ones (pods whose Pod.Status.Phase is not Pending, Running or Unknown).

Compute deleteCount, the number of terminated pods in excess of terminatedPodThreshold.

Start deleteCount goroutines that call gcc.deletePod (which invokes the apiserver's API) in parallel to delete the corresponding pods immediately.
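A small worked example may make the selection rule concrete. The sketch below uses a hypothetical pod struct instead of *v1.Pod and assumes the creation-timestamp sort puts the oldest pods first, so the oldest surplus pods are the ones removed. surplusTerminated, deleteAll and createdAt are illustrative names, not functions from the Kubernetes source.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// pod is a hypothetical, minimal stand-in for *v1.Pod.
type pod struct {
	Name    string
	Phase   string // "Pending", "Running", "Unknown", "Succeeded" or "Failed"
	Created time.Time
}

// createdAt returns a fixed base time plus h hours, for readable test data.
func createdAt(h int) time.Time {
	return time.Date(2018, 1, 1, 0, 0, 0, 0, time.UTC).Add(time.Duration(h) * time.Hour)
}

// isTerminated follows the rule above: any phase other than Pending,
// Running or Unknown counts as terminated.
func isTerminated(p pod) bool {
	return p.Phase != "Pending" && p.Phase != "Running" && p.Phase != "Unknown"
}

// surplusTerminated returns the oldest terminated pods beyond threshold.
func surplusTerminated(pods []pod, threshold int) []pod {
	terminated := []pod{}
	for _, p := range pods {
		if isTerminated(p) {
			terminated = append(terminated, p)
		}
	}
	sort.Slice(terminated, func(i, j int) bool {
		return terminated[i].Created.Before(terminated[j].Created)
	})
	deleteCount := len(terminated) - threshold
	if deleteCount <= 0 {
		return nil
	}
	if deleteCount > len(terminated) { // only reachable with a negative threshold
		deleteCount = len(terminated)
	}
	return terminated[:deleteCount]
}

// deleteAll calls del for every pod concurrently and waits for all calls.
func deleteAll(pods []pod, del func(name string) error) {
	var wg sync.WaitGroup
	for _, p := range pods {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			_ = del(name) // errors are ignored in this sketch
		}(p.Name)
	}
	wg.Wait()
}

func main() {
	pods := []pod{
		{"a", "Succeeded", createdAt(3)},
		{"b", "Running", createdAt(0)},
		{"c", "Failed", createdAt(1)},
		{"d", "Succeeded", createdAt(2)},
	}
	var mu sync.Mutex
	deleted := []string{}
	// Three pods are terminated; with a threshold of 1, the two oldest go.
	deleteAll(surplusTerminated(pods, 1), func(name string) error {
		mu.Lock()
		defer mu.Unlock()
		deleted = append(deleted, name)
		return nil
	})
	sort.Strings(deleted)
	fmt.Println(deleted) // [c d]
}
```

The deletions run concurrently, so the callback guards the shared slice with a mutex and the result is sorted before printing.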

Collecting pods bound to nodes that no longer exist

// gcOrphaned deletes pods that are bound to nodes that don't exist.
func (gcc *PodGCController) gcOrphaned(pods []*v1.Pod) {
	glog.V(4).Infof("GC'ing orphaned")
	// We want to get list of Nodes from the etcd, to make sure that it's as fresh as possible.
	nodes, err := gcc.kubeClient.Core().Nodes().List(metav1.ListOptions{})
	if err != nil {
		return
	}
	nodeNames := sets.NewString()
	for i := range nodes.Items {
		nodeNames.Insert(nodes.Items[i].Name)
	}
	for _, pod := range pods {
		if pod.Spec.NodeName == "" {
			continue
		}
		if nodeNames.Has(pod.Spec.NodeName) {
			continue
		}
		glog.V(2).Infof("Found orphaned Pod %v assigned to the Node %v. Deleting.", pod.Name, pod.Spec.NodeName)
		if err := gcc.deletePod(pod.Namespace, pod.Name); err != nil {
			utilruntime.HandleError(err)
		} else {
			glog.V(0).Infof("Forced deletion of orphaned Pod %s succeeded", pod.Name)
		}
	}
}

gcOrphaned deletes pods whose bound node no longer exists.

Call the apiserver to get all Nodes.

Iterate over all pods; if a pod's NodeName is non-empty and is not among the Nodes just fetched, call gcc.deletePod to delete it, one pod at a time.
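The orphan check amounts to a set-membership test on node names. The following sketch shows that filter in isolation, using a hypothetical pod struct and a plain map in place of sets.String; orphaned is an illustrative helper and performs no deletion.

```go
package main

import "fmt"

// pod is a hypothetical, minimal stand-in for *v1.Pod.
type pod struct {
	Name     string
	NodeName string // empty means the pod has not been scheduled
}

// orphaned returns the names of pods bound to a node that is not in
// existingNodes. Unscheduled pods are skipped.
func orphaned(pods []pod, existingNodes []string) []string {
	nodeNames := make(map[string]struct{}, len(existingNodes))
	for _, n := range existingNodes {
		nodeNames[n] = struct{}{}
	}
	var result []string
	for _, p := range pods {
		if p.NodeName == "" {
			continue // not bound to any node, so not an orphan
		}
		if _, ok := nodeNames[p.NodeName]; ok {
			continue // the node still exists
		}
		result = append(result, p.Name)
	}
	return result
}

func main() {
	pods := []pod{
		{"web-0", "node-1"},
		{"web-1", "node-2"}, // node-2 is gone
		{"web-2", ""},       // never scheduled
	}
	fmt.Println(orphaned(pods, []string{"node-1"})) // [web-1]
}
```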

Collecting unscheduled, terminating pods

pkg/controller/podgc/gc_controller.go:167
// gcUnscheduledTerminating deletes pods that are terminating and haven't been scheduled to a particular node.
func (gcc *PodGCController) gcUnscheduledTerminating(pods []*v1.Pod) {
	glog.V(4).Infof("GC'ing unscheduled pods which are terminating.")
	for _, pod := range pods {
		if pod.DeletionTimestamp == nil || len(pod.Spec.NodeName) > 0 {
			continue
		}
		glog.V(2).Infof("Found unscheduled terminating Pod %v not assigned to any Node. Deleting.", pod.Name)
		if err := gcc.deletePod(pod.Namespace, pod.Name); err != nil {
			utilruntime.HandleError(err)
		} else {
			glog.V(0).Infof("Forced deletion of unscheduled terminating Pod %s succeeded", pod.Name)
		}
	}
}

gcUnscheduledTerminating deletes pods that are terminating and have not been scheduled to any node.

Iterate over all pods and select those that are terminating (pod.DeletionTimestamp != nil) and unscheduled (pod.Spec.NodeName is empty).

Call gcc.deletePod to delete them, one pod at a time.
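The two conditions combine into a single filter, sketched here on a hypothetical pod struct. unscheduledTerminating and deletionTime are illustrative helpers; the sketch only selects pods and leaves out the deletion itself.

```go
package main

import (
	"fmt"
	"time"
)

// pod is a hypothetical, minimal stand-in for *v1.Pod.
type pod struct {
	Name              string
	NodeName          string     // empty means the pod has not been scheduled
	DeletionTimestamp *time.Time // non-nil means deletion was requested
}

// deletionTime returns a non-nil timestamp for building test data.
func deletionTime() *time.Time {
	t := time.Now()
	return &t
}

// unscheduledTerminating returns the names of pods that are marked for
// deletion and were never assigned to a node.
func unscheduledTerminating(pods []pod) []string {
	var result []string
	for _, p := range pods {
		if p.DeletionTimestamp == nil || len(p.NodeName) > 0 {
			continue
		}
		result = append(result, p.Name)
	}
	return result
}

func main() {
	ts := deletionTime()
	pods := []pod{
		{Name: "pending-deleted", DeletionTimestamp: ts},
		{Name: "pending-alive"},
		{Name: "scheduled-deleted", NodeName: "node-1", DeletionTimestamp: ts},
	}
	fmt.Println(unscheduledTerminating(pods)) // [pending-deleted]
}
```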
