How to customize and extend the Kubernetes scheduler


Extending the default scheduler by adding Predicate and Priority policies

Adding a Predicate policy

Predicate interface

// FitPredicate is a function that indicates if a pod fits into an existing node.
// The failure information is given by the error.
type FitPredicate func(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (bool, []PredicateFailureReason, error)

Implement a predicate function

func PodFitsHostNew(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
	// A pod that does not request a specific node fits on any node.
	if len(pod.Spec.NodeName) == 0 {
		return true, nil, nil
	}
	node := nodeInfo.Node()
	if node == nil {
		return false, nil, fmt.Errorf("node not found")
	}
	if pod.Spec.NodeName == node.Name {
		return true, nil, nil
	}
	return false, []algorithm.PredicateFailureReason{ErrPodNotMatchHostName}, nil
}
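The predicate logic can be tried outside the scheduler source tree. The following is a minimal sketch, assuming simplified stand-in types: `Pod`, `Node`, `podFitsHost`, and `errNodeNotFound` are hypothetical names that only mirror the fields and branches used above, not the real `v1` or `schedulercache` types.

```go
package main

import (
	"errors"
	"fmt"
)

// Pod and Node are simplified stand-ins for v1.Pod and v1.Node (assumption:
// only the fields this predicate reads are modelled).
type Pod struct {
	Name     string
	NodeName string // mirrors pod.Spec.NodeName
}

type Node struct {
	Name string
}

// errNodeNotFound mirrors the "node not found" error path of the predicate.
var errNodeNotFound = errors.New("node not found")

// podFitsHost reports whether pod may run on node, plus a human-readable
// failure reason when it may not.
func podFitsHost(pod Pod, node *Node) (bool, string, error) {
	if pod.NodeName == "" {
		return true, "", nil // no node requested: every node fits
	}
	if node == nil {
		return false, "", errNodeNotFound
	}
	if pod.NodeName == node.Name {
		return true, "", nil
	}
	return false, "pod does not match host name", nil
}

func main() {
	nodes := []Node{{Name: "node-a"}, {Name: "node-b"}}
	pod := Pod{Name: "nginx", NodeName: "node-b"}
	for i := range nodes {
		fit, reason, err := podFitsHost(pod, &nodes[i])
		fmt.Println(nodes[i].Name, fit, reason, err)
	}
}
```

Note the three distinct outcomes: a fit, a non-fit with a reason, and an error. The real predicate keeps the same separation through its three return values.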

Register the custom predicate under a custom name

func init() {
	factory.RegisterAlgorithmProvider(factory.DefaultProvider, defaultPredicates(), defaultPriorities())
	// Cluster autoscaler friendly scheduling algorithm.
	factory.RegisterAlgorithmProvider(ClusterAutoscalerProvider, defaultPredicates(),
		copyAndReplace(defaultPriorities(), "LeastRequestedPriority", "MostRequestedPriority"))
	// Register the custom predicate under the name referenced in the policy file.
	factory.RegisterFitPredicate("CustomPredicatePolicy", predicates.PodFitsHostNew)
}
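Conceptually, registration ties a name to a function so that a policy file can later select the function by that name. The sketch below illustrates the idea only; `registry`, `registerFitPredicate`, `predicatesFor`, and the simplified `fitPredicate` signature are hypothetical and are not the scheduler's actual `factory` code.

```go
package main

import "fmt"

// fitPredicate is a hypothetical, simplified predicate signature:
// it receives the node name requested by the pod and the candidate node name.
type fitPredicate func(requestedNode, candidateNode string) bool

// registry maps policy names to predicate functions.
var registry = map[string]fitPredicate{}

// registerFitPredicate stores p under name and returns the name.
func registerFitPredicate(name string, p fitPredicate) string {
	registry[name] = p
	return name
}

// predicatesFor resolves the names listed in a policy into functions and
// fails on any name that was never registered.
func predicatesFor(names []string) ([]fitPredicate, error) {
	out := make([]fitPredicate, 0, len(names))
	for _, n := range names {
		p, ok := registry[n]
		if !ok {
			return nil, fmt.Errorf("predicate %q is not registered", n)
		}
		out = append(out, p)
	}
	return out, nil
}

func main() {
	registerFitPredicate("CustomPredicatePolicy", func(requested, candidate string) bool {
		return requested == "" || requested == candidate
	})
	ps, err := predicatesFor([]string{"CustomPredicatePolicy"})
	fmt.Println(len(ps), err)
	_, err = predicatesFor([]string{"Unknown"})
	fmt.Println(err)
}
```

This is why the name passed at registration must match the name in the policy file exactly: a mismatch means the lookup finds nothing.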

Rebuild kube-scheduler and restart it with the --policy-config-file flag

kube-scheduler xxxx --policy-config-file=/var/lib/kube-scheduler/policy.config

Contents of the file passed to --policy-config-file

{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [{"name": "CustomPredicatePolicy"}],
  "priorities": []
}
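A malformed policy file is easy to produce by hand, so it can help to check it before restarting the scheduler. The following is a minimal sketch of such a check, assuming a hypothetical `policy` struct that mirrors only the fields shown above; it is not the scheduler's own policy type or loader.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy is a hypothetical minimal mirror of the policy file's JSON shape.
type policy struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
	Predicates []struct {
		Name string `json:"name"`
	} `json:"predicates"`
	Priorities []struct {
		Name   string `json:"name"`
		Weight int    `json:"weight"`
	} `json:"priorities"`
}

// parsePolicy decodes the file contents and rejects anything that is not
// valid JSON or does not declare kind "Policy".
func parsePolicy(data []byte) (policy, error) {
	var p policy
	if err := json.Unmarshal(data, &p); err != nil {
		return policy{}, err
	}
	if p.Kind != "Policy" {
		return policy{}, fmt.Errorf("unexpected kind %q", p.Kind)
	}
	return p, nil
}

func main() {
	raw := []byte(`{"kind":"Policy","apiVersion":"v1","predicates":[{"name":"CustomPredicatePolicy"}],"priorities":[]}`)
	p, err := parsePolicy(raw)
	if err != nil {
		fmt.Println("invalid policy:", err)
		return
	}
	for _, pred := range p.Predicates {
		fmt.Println("predicate:", pred.Name)
	}
}
```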

Adding a Priority policy

Priority interface

// PriorityMapFunction is a function that computes per-node results for a given node.
type PriorityMapFunction func(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (schedulerapi.HostPriority, error)

Contents of the policy file that enables the custom priority

{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [],
  "priorities": [{"name": "CustomPriorityPolicy", "weight": 1}]
}
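The text gives the priority interface but no implementation, so here is a rough sketch of what a map function and its weight might look like. Everything in it is an assumption made for illustration: `nodeInfo` and `hostPriority` are simplified stand-ins for `schedulercache.NodeInfo` and `schedulerapi.HostPriority`, `fewestPodsMap` is a hypothetical scoring rule, and `maxScore` is an assumed per-priority upper bound.

```go
package main

import "fmt"

// nodeInfo and hostPriority are simplified stand-ins (assumptions) for
// schedulercache.NodeInfo and schedulerapi.HostPriority.
type nodeInfo struct {
	Name     string
	PodCount int
}

type hostPriority struct {
	Host  string
	Score int
}

// maxScore is an assumed upper bound for a single priority's score.
const maxScore = 10

// fewestPodsMap is a hypothetical map function: emptier nodes score higher,
// and the score never drops below zero.
func fewestPodsMap(n nodeInfo) hostPriority {
	score := maxScore - n.PodCount
	if score < 0 {
		score = 0
	}
	return hostPriority{Host: n.Name, Score: score}
}

// pickBest multiplies each node's score by the policy weight and returns the
// highest-scoring host; ties go to the earliest node in the slice.
func pickBest(nodes []nodeInfo, weight int) (hostPriority, bool) {
	var best hostPriority
	found := false
	for _, n := range nodes {
		hp := fewestPodsMap(n)
		hp.Score *= weight
		if !found || hp.Score > best.Score {
			best, found = hp, true
		}
	}
	return best, found
}

func main() {
	nodes := []nodeInfo{
		{Name: "node-a", PodCount: 7},
		{Name: "node-b", PodCount: 2},
		{Name: "node-c", PodCount: 12},
	}
	if best, ok := pickBest(nodes, 1); ok {
		fmt.Printf("%s scored %d\n", best.Host, best.Score)
	}
}
```

With a single priority the weight does not change the ranking; it matters once several priorities contribute scores to the same node.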

Adding a custom scheduler and selecting it per pod with schedulerName

A custom scheduler can be written in any language and can be as simple or complex as you need.

Specify "schedulerName" in the pod spec

apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  schedulerName: my-scheduler
  containers:
  - name: nginx
    image: nginx:1.10

Here is a very simple example of a custom scheduler written in Bash that assigns a node randomly. Note that you need to run this along with kubectl proxy for it to work.

kubectl proxy --port=8001

#!/bin/bash
SERVER='localhost:8001'
while true; do
    # Find pods that ask for my-scheduler and have no node assigned yet.
    for PODNAME in $(kubectl --server $SERVER get pods -o json | jq '.items[] | select(.spec.schedulerName == "my-scheduler") | select(.spec.nodeName == null) | .metadata.name' | tr -d '"'); do
        NODES=($(kubectl --server $SERVER get nodes -o json | jq '.items[].metadata.name' | tr -d '"'))
        # Pick one node at random.
        NUMNODES=${#NODES[@]}
        CHOSEN=${NODES[$((RANDOM % NUMNODES))]}
        curl --header "Content-Type:application/json" --request POST --data '{"apiVersion":"v1", "kind": "Binding", "metadata": {"name": "'$PODNAME'"}, "target": {"apiVersion": "v1", "kind": "Node", "name": "'$CHOSEN'"}}' http://$SERVER/api/v1/namespaces/default/pods/$PODNAME/binding/
        echo "Assigned $PODNAME to $CHOSEN"
    done
    sleep 1
done
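The two core steps of the script, picking a random node and building the Binding request, can also be sketched in Go. This is only an illustration under stated assumptions: the `binding`, `objectMeta`, and `objectRef` structs are hypothetical mirrors of the JSON body sent by curl above, the helper names are made up, and the sketch stops short of listing pods or sending the HTTP request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
)

// The structs below are hypothetical mirrors of the JSON body posted to the
// pod's binding endpoint; they are not the Kubernetes client types.
type objectMeta struct {
	Name string `json:"name"`
}

type objectRef struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
}

type binding struct {
	APIVersion string     `json:"apiVersion"`
	Kind       string     `json:"kind"`
	Metadata   objectMeta `json:"metadata"`
	Target     objectRef  `json:"target"`
}

// chooseNode picks one node uniformly at random.
func chooseNode(nodes []string, r *rand.Rand) (string, error) {
	if len(nodes) == 0 {
		return "", fmt.Errorf("no nodes available")
	}
	return nodes[r.Intn(len(nodes))], nil
}

// bindingBody builds the JSON payload that assigns pod to node.
func bindingBody(pod, node string) ([]byte, error) {
	return json.Marshal(binding{
		APIVersion: "v1",
		Kind:       "Binding",
		Metadata:   objectMeta{Name: pod},
		Target:     objectRef{APIVersion: "v1", Kind: "Node", Name: node},
	})
}

// bindingURL builds the endpoint used by the script, served via kubectl proxy.
func bindingURL(server, namespace, pod string) string {
	return fmt.Sprintf("http://%s/api/v1/namespaces/%s/pods/%s/binding/", server, namespace, pod)
}

func main() {
	r := rand.New(rand.NewSource(1))
	node, err := chooseNode([]string{"node-a", "node-b"}, r)
	if err != nil {
		fmt.Println(err)
		return
	}
	body, err := bindingBody("nginx", node)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(bindingURL("localhost:8001", "default", "nginx"))
	fmt.Println(string(body))
}
```

Unlike the Bash version, this sketch refuses to pick from an empty node list, which avoids a division by zero when no nodes are returned.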

