  • kubernetes/k8s CNI analysis: Container Network Interface

    Related posts: kubernetes/k8s CSI analysis: Container Storage Interface
    kubernetes/k8s CRI analysis: Container Runtime Interface

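    Overview

    Kubernetes was designed around a pluggable architecture so that its functionality is easy to extend. Following that idea, Kubernetes defines three purpose-specific interfaces: the Container Network Interface (CNI), the Container Runtime Interface (CRI), and the Container Storage Interface (CSI). Kubernetes calls these interfaces to carry out the corresponding work.

    This post introduces and analyzes the Container Network Interface, CNI.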

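    What is CNI

    CNI stands for Container Network Interface.

    CNI is the standard interface through which K8s calls a network implementation. Kubelet uses this interface to call different network plugins, and in that way supports different network configurations.

    A CNI network plugin is an executable that conforms to the Container Network Interface (CNI) specification. Common CNI network plugins include Calico, flannel, Terway, and Weave Net.

    When kubelet is configured to use a CNI-type network plugin (through kubelet startup flags), it calls the CNI network plugin whenever it creates or deletes a pod, to set up or tear down the pod's network.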

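    kubelet network plugins

    kubelet supports the following three types of network plugin:
    (1) CNI;
    (2) kubenet;
    (3) Noop, meaning that no network plugin is configured.

    This post concentrates on the CNI-related source code in kubelet.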

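    CNI architecture

    When kubelet creates or deletes a pod, it calls the CRI, and the CRI in turn calls CNI to set up or tear down the pod network.

    How kubelet builds the pod network, in outline

    (1) kubelet first creates the pause container (pod sandbox) through the CRI, which creates the network namespace;
    (2) kubelet calls the concrete network plugin selected by its startup flags, for example the CNI network plugin;
    (3) the network plugin configures the network of the pause container (pod sandbox);
    (4) all other containers in the pod share the network of the pause container (pod sandbox).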

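    Analysis of the CNI-related source code in kubelet

    The analysis of kubelet's CNI source code is split into the following parts:
    (1) CNI-related startup flags;
    (2) key structs and interfaces;
    (3) CNI initialization;
    (4) how CNI sets up the pod network;
    (5) how CNI tears down the pod network.

    The analysis is based on tag v1.17.4.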

    https://github.com/kubernetes/kubernetes/releases/tag/v1.17.4

    1. CNI-related kubelet startup flags

    The code that registers kubelet's CNI-related startup flags is shown below:

    // pkg/kubelet/config/flags.go
    func (s *ContainerRuntimeOptions) AddFlags(fs *pflag.FlagSet) {
        ...
        // Network plugin settings for Docker.
    	fs.StringVar(&s.NetworkPluginName, "network-plugin", s.NetworkPluginName, fmt.Sprintf("<Warning: Alpha feature> The name of the network plugin to be invoked for various events in kubelet/pod lifecycle. %s", dockerOnlyWarning))
    	fs.StringVar(&s.CNIConfDir, "cni-conf-dir", s.CNIConfDir, fmt.Sprintf("<Warning: Alpha feature> The full path of the directory in which to search for CNI config files. %s", dockerOnlyWarning))
    	fs.StringVar(&s.CNIBinDir, "cni-bin-dir", s.CNIBinDir, fmt.Sprintf("<Warning: Alpha feature> A comma-separated list of full paths of directories in which to search for CNI plugin binaries. %s", dockerOnlyWarning))
    	fs.StringVar(&s.CNICacheDir, "cni-cache-dir", s.CNICacheDir, fmt.Sprintf("<Warning: Alpha feature> The full path of the directory in which CNI should store cache files. %s", dockerOnlyWarning))
    	fs.Int32Var(&s.NetworkPluginMTU, "network-plugin-mtu", s.NetworkPluginMTU, fmt.Sprintf("<Warning: Alpha feature> The MTU to be passed to the network plugin, to override the default. Set to 0 to use the default 1460 MTU. %s", dockerOnlyWarning))
        ...
    }
    

    The default values of the CNI-related startup flags are set in the NewContainerRuntimeOptions function.

    // cmd/kubelet/app/options/container_runtime.go
    // NewContainerRuntimeOptions will create a new ContainerRuntimeOptions with
    // default values.
    func NewContainerRuntimeOptions() *config.ContainerRuntimeOptions {
    	dockerEndpoint := ""
    	if runtime.GOOS != "windows" {
    		dockerEndpoint = "unix:///var/run/docker.sock"
    	}
    
    	return &config.ContainerRuntimeOptions{
    		ContainerRuntime:           kubetypes.DockerContainerRuntime,
    		RedirectContainerStreaming: false,
    		DockerEndpoint:             dockerEndpoint,
    		DockershimRootDirectory:    "/var/lib/dockershim",
    		PodSandboxImage:            defaultPodSandboxImage,
    		ImagePullProgressDeadline:  metav1.Duration{Duration: 1 * time.Minute},
    		ExperimentalDockershim:     false,
    
    		//Alpha feature
    		CNIBinDir:   "/opt/cni/bin",
    		CNIConfDir:  "/etc/cni/net.d",
    		CNICacheDir: "/var/lib/cni/cache",
    	}
    }
    

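    Here is a brief look at the most important CNI-related startup flags:
    (1) --network-plugin: the type of network plugin to use. Valid values are cni, kubenet, and "". The default is the empty string, which means Noop: no network plugin is configured and no pod network is built. Setting the value to cni makes kubelet use the CNI network plugin type.

    (2) --cni-conf-dir: the directory that holds the CNI configuration files. Default: /etc/cni/net.d

    (3) --cni-bin-dir: the directory (or comma-separated list of directories) that holds the CNI plugin executables. kubelet looks in this path for the CNI plugin executables it runs to perform pod network operations. Default: /opt/cni/bin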
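
    To make the defaults and the comma-separated form of --cni-bin-dir concrete, here is a minimal sketch that uses only the Go standard library flag package. The cniFlags type and the parseCNIFlags helper are hypothetical names for this illustration; kubelet itself registers these flags with pflag on ContainerRuntimeOptions, as shown above.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// cniFlags is a hypothetical holder for the three flags discussed above.
type cniFlags struct {
	NetworkPlugin string
	ConfDir       string
	BinDirs       []string
}

// parseCNIFlags parses args using the defaults listed in the article.
func parseCNIFlags(args []string) (cniFlags, error) {
	fs := flag.NewFlagSet("kubelet-sketch", flag.ContinueOnError)
	plugin := fs.String("network-plugin", "", "cni, kubenet, or empty for no-op")
	confDir := fs.String("cni-conf-dir", "/etc/cni/net.d", "CNI config directory")
	binDir := fs.String("cni-bin-dir", "/opt/cni/bin", "comma-separated CNI binary directories")
	if err := fs.Parse(args); err != nil {
		return cniFlags{}, err
	}
	// Split the comma-separated list and drop empty entries.
	var dirs []string
	for _, d := range strings.Split(*binDir, ",") {
		if d != "" {
			dirs = append(dirs, d)
		}
	}
	return cniFlags{NetworkPlugin: *plugin, ConfDir: *confDir, BinDirs: dirs}, nil
}

func main() {
	f, err := parseCNIFlags([]string{"--network-plugin=cni", "--cni-bin-dir=/opt/cni/bin,/usr/libexec/cni"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", f)
}
```

    The second directory in main is an arbitrary example value, not a kubelet default.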

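    2. Key structs and interfaces

    interface NetworkPlugin

    First, the key interface: NetworkPlugin

    The NetworkPlugin interface declares the operations of a kubelet network plugin; each type of network plugin only needs to implement these methods. The most important ones are SetUpPod and TearDownPod, which set up and tear down the pod network respectively. cniNetworkPlugin implements this interface.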

    // pkg/kubelet/dockershim/network/plugins.go
    // NetworkPlugin is an interface to network plugins for the kubelet
    type NetworkPlugin interface {
    	// Init initializes the plugin.  This will be called exactly once
    	// before any other methods are called.
    	Init(host Host, hairpinMode kubeletconfig.HairpinMode, nonMasqueradeCIDR string, mtu int) error
    
    	// Called on various events like:
    	// NET_PLUGIN_EVENT_POD_CIDR_CHANGE
    	Event(name string, details map[string]interface{})
    
    	// Name returns the plugin's name. This will be used when searching
    	// for a plugin by name, e.g.
    	Name() string
    
    	// Returns a set of NET_PLUGIN_CAPABILITY_*
    	Capabilities() utilsets.Int
    
    	// SetUpPod is the method called after the infra container of
    	// the pod has been created but before the other containers of the
    	// pod are launched.
    	SetUpPod(namespace string, name string, podSandboxID kubecontainer.ContainerID, annotations, options map[string]string) error
    
    	// TearDownPod is the method called before a pod's infra container will be deleted
    	TearDownPod(namespace string, name string, podSandboxID kubecontainer.ContainerID) error
    
    	// GetPodNetworkStatus is the method called to obtain the ipv4 or ipv6 addresses of the container
    	GetPodNetworkStatus(namespace string, name string, podSandboxID kubecontainer.ContainerID) (*PodNetworkStatus, error)
    
    	// Status returns error if the network plugin is in error state
    	Status() error
    }
    

    struct cniNetworkPlugin

    The cniNetworkPlugin struct implements the NetworkPlugin interface, including the SetUpPod and TearDownPod methods.

    // pkg/kubelet/dockershim/network/cni/cni.go
    type cniNetworkPlugin struct {
    	network.NoopNetworkPlugin
    
    	loNetwork *cniNetwork
    
    	sync.RWMutex
    	defaultNetwork *cniNetwork
    
    	host        network.Host
    	execer      utilexec.Interface
    	nsenterPath string
    	confDir     string
    	binDirs     []string
    	cacheDir    string
    	podCidr     string
    }
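
    Note that cniNetworkPlugin embeds network.NoopNetworkPlugin, so it only has to override the methods it cares about. The sketch below illustrates that embedding pattern with a trimmed-down interface; networkPlugin, noopPlugin and cniPlugin are hypothetical names for this example and are not the kubelet types.

```go
package main

import "fmt"

// networkPlugin is a trimmed-down, hypothetical stand-in for NetworkPlugin.
type networkPlugin interface {
	Name() string
	SetUpPod(namespace, name string) error
	Status() error
}

// noopPlugin provides do-nothing defaults, similar in spirit to NoopNetworkPlugin.
type noopPlugin struct{}

func (noopPlugin) Name() string                          { return "noop" }
func (noopPlugin) SetUpPod(namespace, name string) error { return nil }
func (noopPlugin) Status() error                         { return nil }

// cniPlugin embeds noopPlugin and overrides only what it needs.
type cniPlugin struct {
	noopPlugin
	setUpCalls []string
}

// Name overrides the embedded default.
func (p *cniPlugin) Name() string { return "cni" }

// SetUpPod overrides the embedded default and records the call.
func (p *cniPlugin) SetUpPod(namespace, name string) error {
	p.setUpCalls = append(p.setUpCalls, namespace+"/"+name)
	return nil
}

func main() {
	var p networkPlugin = &cniPlugin{}
	_ = p.SetUpPod("default", "nginx")
	// Status is not overridden, so the embedded no-op version is used.
	fmt.Println(p.Name(), p.Status())
}
```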
    

    struct PluginManager

    The plugin field of struct PluginManager has the interface type NetworkPlugin, so it can hold any concrete network plugin implementation, such as the cniNetworkPlugin struct.

    // pkg/kubelet/dockershim/network/plugins.go
    // The PluginManager wraps a kubelet network plugin and provides synchronization
    // for a given pod's network operations.  Each pod's setup/teardown/status operations
    // are synchronized against each other, but network operations of other pods can
    // proceed in parallel.
    type PluginManager struct {
    	// Network plugin being wrapped
    	plugin NetworkPlugin
    
    	// Pod list and lock
    	podsLock sync.Mutex
    	pods     map[string]*podLock
    }
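
    The comment on PluginManager says that operations on one pod are serialized while other pods proceed in parallel. A reference-counted lock per pod name is one way to get that behavior; the following is a minimal sketch under that assumption. The podLocker type and its lock/unlock methods are hypothetical names, and the real podLock implementation may differ in detail.

```go
package main

import (
	"fmt"
	"sync"
)

// podLock pairs a mutex with a reference count so that the map entry can be
// removed once nobody holds or waits for the lock.
type podLock struct {
	refcount uint
	mu       sync.Mutex
}

// podLocker is a hypothetical, simplified stand-in for PluginManager's locking.
type podLocker struct {
	podsLock sync.Mutex
	pods     map[string]*podLock
}

func newPodLocker() *podLocker {
	return &podLocker{pods: make(map[string]*podLock)}
}

// lock blocks until the caller owns the lock for the given pod.
func (l *podLocker) lock(fullPodName string) {
	l.podsLock.Lock()
	pl, ok := l.pods[fullPodName]
	if !ok {
		pl = &podLock{}
		l.pods[fullPodName] = pl
	}
	pl.refcount++
	l.podsLock.Unlock()
	pl.mu.Lock()
}

// unlock releases the pod's lock and drops the map entry when it is unused.
func (l *podLocker) unlock(fullPodName string) {
	l.podsLock.Lock()
	defer l.podsLock.Unlock()
	pl, ok := l.pods[fullPodName]
	if !ok {
		return
	}
	pl.refcount--
	pl.mu.Unlock()
	if pl.refcount == 0 {
		delete(l.pods, fullPodName)
	}
}

func main() {
	l := newPodLocker()
	var wg sync.WaitGroup
	total := 0
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.lock("nginx_default")
			total++ // protected by the per-pod lock
			l.unlock("nginx_default")
		}()
	}
	wg.Wait()
	fmt.Println(total, len(l.pods))
}
```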
    

    struct dockerService

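    struct dockerService was analyzed in detail in the CRI post, which is worth revisiting; here is a short recap.

    struct dockerService implements the container runtime interface and the container image interface on the CRI shim server side, so it represents the server side of dockershim (kubelet's built-in CRI shim).

    The network field of struct dockerService has the type struct PluginManager. When the struct is initialized, the concrete network plugin struct, such as struct cniNetworkPlugin, is stored in this field.

    When a pod is created or deleted, the concrete network plugin stored in the network field of dockerService is used: the SetUpPod and TearDownPod methods of that plugin (for example cniNetworkPlugin) are called to set up and tear down the pod's network.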

    // pkg/kubelet/dockershim/docker_service.go
    type dockerService struct {
    	client           libdocker.Interface
    	os               kubecontainer.OSInterface
    	podSandboxImage  string
    	streamingRuntime *streamingRuntime
    	streamingServer  streaming.Server
    
    	network *network.PluginManager
    	// Map of podSandboxID :: network-is-ready
    	networkReady     map[string]bool
    	networkReadyLock sync.Mutex
    
    	containerManager cm.ContainerManager
    	// cgroup driver used by Docker runtime.
    	cgroupDriver      string
    	checkpointManager checkpointmanager.CheckpointManager
    	// caches the version of the runtime.
    	// To be compatible with multiple docker versions, we need to perform
    	// version checking for some operations. Use this cache to avoid querying
    	// the docker daemon every time we need to do such checks.
    	versionCache *cache.ObjectCache
    	// startLocalStreamingServer indicates whether dockershim should start a
    	// streaming server on localhost.
    	startLocalStreamingServer bool
    
    	// containerCleanupInfos maps container IDs to the `containerCleanupInfo` structs
    	// needed to clean up after containers have been removed.
    	// (see `applyPlatformSpecificDockerConfig` and `performPlatformSpecificContainerCleanup`
    	// methods for more info).
    	containerCleanupInfos map[string]*containerCleanupInfo
    }
    

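    3. CNI initialization

    During startup, kubelet takes two network-related steps: it probes for the network plugins available in the current environment, and it initializes the selected network plugin. (CNI is initialized only when the container runtime is the built-in dockershim; once initialized, it is handed to dockershim for use.)

    The CNI initialization call chain: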
    main (cmd/kubelet/kubelet.go)
    -> NewKubeletCommand (cmd/kubelet/app/server.go)
    -> Run (cmd/kubelet/app/server.go)
    -> run (cmd/kubelet/app/server.go)
    -> RunKubelet (cmd/kubelet/app/server.go)
    -> CreateAndInitKubelet(cmd/kubelet/app/server.go)
    -> kubelet.NewMainKubelet(pkg/kubelet/kubelet.go)
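    -> dockershim.NewDockerService (pkg/kubelet/dockershim/docker_service.go)
    -> cni.ProbeNetworkPlugins (pkg/kubelet/dockershim/network/cni/cni.go) & network.InitNetworkPlugin (pkg/kubelet/dockershim/network/plugins.go)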

    The call chain is long, so we go straight to the key function, NewMainKubelet.

    NewMainKubelet

    The part of NewMainKubelet that matters here is the call to dockershim.NewDockerService.

    // pkg/kubelet/kubelet.go
    // NewMainKubelet instantiates a new Kubelet object along with all the required internal modules.
    // No initialization of Kubelet and its modules should happen here.
    func NewMainKubelet(kubeCfg *kubeletconfiginternal.KubeletConfiguration,...) {
        ...
        switch containerRuntime {
    	case kubetypes.DockerContainerRuntime:
    		// Create and start the CRI shim running as a grpc server.
    		streamingConfig := getStreamingConfig(kubeCfg, kubeDeps, crOptions)
    		ds, err := dockershim.NewDockerService(kubeDeps.DockerClientConfig, crOptions.PodSandboxImage, streamingConfig,
    			&pluginSettings, runtimeCgroups, kubeCfg.CgroupDriver, crOptions.DockershimRootDirectory, !crOptions.RedirectContainerStreaming)
        ...
    }
    

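    We analyze the case where the variable containerRuntime equals docker, that is, the kubelet startup flag --container-runtime is set to docker. In this case kubelet uses its built-in CRI shim, dockershim, as the container runtime, and initializes and starts dockershim.

    The call to dockershim.NewDockerService creates and initializes the dockershim server, which includes initializing the docker client and the CNI network configuration.

    The main logic of the CNI part is:
    (1) call cni.ProbeNetworkPlugins: based on the CNI-related kubelet startup flags, collect the CNI configuration files, the CNI plugin executables and related information, then use this information to initialize and return a cniNetworkPlugin struct;
    (2) call network.InitNetworkPlugin: based on the value of networkPluginName (the kubelet startup flag --network-plugin), select the matching network plugin and call its Init() method to initialize it (the initialization mainly starts a goroutine that periodically re-reads the CNI configuration files and re-checks the executables, so that they can be updated without a restart);
    (3) assign the cniNetworkPlugin struct obtained above to the network field of the dockerService struct, so that later, when pods are created or deleted, the SetUpPod and TearDownPod methods of cniNetworkPlugin can be called to set up and tear down the pod's network.

    The main code of kubelet's CNI implementation is pkg/kubelet/dockershim/network/cni/cni.go, SetUpPod/TearDownPod (setting up and tearing down the pod network).

    The value of the function parameter pluginSettings *NetworkPluginSettings comes from the kubelet startup flags. The CNI-related startup flags were analyzed earlier; refer back to that section if needed.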

    // pkg/kubelet/dockershim/docker_service.go
    // NewDockerService creates a new `DockerService` struct.
    // NOTE: Anything passed to DockerService should be eventually handled in another way when we switch to running the shim as a different process.
    func NewDockerService(config *ClientConfig, podSandboxImage string, streamingConfig *streaming.Config, pluginSettings *NetworkPluginSettings,
    	cgroupsName string, kubeCgroupDriver string, dockershimRootDir string, startLocalStreamingServer bool, noJsonLogPath string) (DockerService, error) {
        ...
        ds := &dockerService{
    		client:          c,
    		os:              kubecontainer.RealOS{},
    		podSandboxImage: podSandboxImage,
    		streamingRuntime: &streamingRuntime{
    			client:      client,
    			execHandler: &NativeExecHandler{},
    		},
    		containerManager:          cm.NewContainerManager(cgroupsName, client),
    		checkpointManager:         checkpointManager,
    		startLocalStreamingServer: startLocalStreamingServer,
    		networkReady:              make(map[string]bool),
    		containerCleanupInfos:     make(map[string]*containerCleanupInfo),
    		noJsonLogPath:             noJsonLogPath,
    	}
    	...
        // dockershim currently only supports CNI plugins.
    	pluginSettings.PluginBinDirs = cni.SplitDirs(pluginSettings.PluginBinDirString)
    	// (1) Based on the CNI-related kubelet startup flags, collect the CNI config files and plugin executables, then initialize and return a cniNetworkPlugin struct.
    	cniPlugins := cni.ProbeNetworkPlugins(pluginSettings.PluginConfDir, pluginSettings.PluginCacheDir, pluginSettings.PluginBinDirs)
    	cniPlugins = append(cniPlugins, kubenet.NewPlugin(pluginSettings.PluginBinDirs, pluginSettings.PluginCacheDir))
    	netHost := &dockerNetworkHost{
    		&namespaceGetter{ds},
    		&portMappingGetter{ds},
    	}
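    	// (2) Based on networkPluginName (the kubelet flag --network-plugin), select the matching plugin and call its Init() method; Init mainly starts a goroutine that periodically re-reads the CNI config files and re-checks the executables so they can be updated without a restart.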
    	plug, err := network.InitNetworkPlugin(cniPlugins, pluginSettings.PluginName, netHost, pluginSettings.HairpinMode, pluginSettings.NonMasqueradeCIDR, pluginSettings.MTU)
    	if err != nil {
    		return nil, fmt.Errorf("didn't find compatible CNI plugin with given settings %+v: %v", pluginSettings, err)
    	}
    	// (3) Store the cniNetworkPlugin obtained above in the network field of dockerService, so that SetUpPod and TearDownPod can be called later when pods are created or deleted.
    	ds.network = network.NewPluginManager(plug)
    	klog.Infof("Docker cri networking managed by %v", plug.Name())
        ...
    }
    

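    First, what pluginSettings looks like: it is a struct NetworkPluginSettings that holds the network plugin name, the directories of the plugin executables, the directory of the plugin configuration files, and so on. The code is as follows: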

    // pkg/kubelet/dockershim/docker_service.go
    type NetworkPluginSettings struct {
    	// HairpinMode is best described by comments surrounding the kubelet arg
    	HairpinMode kubeletconfig.HairpinMode
    	// NonMasqueradeCIDR is the range of ips which should *not* be included
    	// in any MASQUERADE rules applied by the plugin
    	NonMasqueradeCIDR string
    	// PluginName is the name of the plugin, runtime shim probes for
    	PluginName string
    	// PluginBinDirString is a list of directiores delimited by commas, in
    	// which the binaries for the plugin with PluginName may be found.
    	PluginBinDirString string
    	// PluginBinDirs is an array of directories in which the binaries for
    	// the plugin with PluginName may be found. The admin is responsible for
    	// provisioning these binaries before-hand.
    	PluginBinDirs []string
    	// PluginConfDir is the directory in which the admin places a CNI conf.
    	// Depending on the plugin, this may be an optional field, eg: kubenet
    	// generates its own plugin conf.
    	PluginConfDir string
    	// PluginCacheDir is the directory in which CNI should store cache files.
    	PluginCacheDir string
    	// MTU is the desired MTU for network devices created by the plugin.
    	MTU int
    }
    

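    3.1 cni.ProbeNetworkPlugins

    The main job of cni.ProbeNetworkPlugins is this: based on the CNI-related kubelet startup flags, collect the CNI configuration files, the CNI plugin executables and related information, then use this information to initialize and return a cniNetworkPlugin struct.

    Note the call to plugin.syncNetworkConfig(), whose main job is to set the defaultNetwork field of the cniNetworkPlugin struct.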

    // pkg/kubelet/dockershim/network/cni/cni.go
    // ProbeNetworkPlugins : get the network plugin based on cni conf file and bin file
    func ProbeNetworkPlugins(confDir, cacheDir string, binDirs []string) []network.NetworkPlugin {
    	old := binDirs
    	binDirs = make([]string, 0, len(binDirs))
    	for _, dir := range old {
    		if dir != "" {
    			binDirs = append(binDirs, dir)
    		}
    	}
    
    	plugin := &cniNetworkPlugin{
    		defaultNetwork: nil,
    		loNetwork:      getLoNetwork(binDirs),
    		execer:         utilexec.New(),
    		confDir:        confDir,
    		binDirs:        binDirs,
    		cacheDir:       cacheDir,
    	}
    
    	// sync NetworkConfig in best effort during probing.
    	plugin.syncNetworkConfig()
    	return []network.NetworkPlugin{plugin}
    }
    
    plugin.syncNetworkConfig()

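    Main logic:
    (1) getDefaultCNINetwork(): look for CNI configuration files in the CNI conf directory given by the kubelet startup flags, and return a cniNetwork struct that carries the CNI information;
    (2) plugin.setDefaultNetwork(): assign the cniNetwork struct obtained in the previous step to the defaultNetwork field of the cniNetworkPlugin struct.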

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) syncNetworkConfig() {
    	network, err := getDefaultCNINetwork(plugin.confDir, plugin.binDirs)
    	if err != nil {
    		klog.Warningf("Unable to update cni config: %s", err)
    		return
    	}
    	plugin.setDefaultNetwork(network)
    }
    
    getDefaultCNINetwork()

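    Main logic:
    (1) Three kinds of CNI configuration file are recognized in the CNI configuration directory: .conf, .conflist, and .json.

    (2) sort.Strings() sorts all CNI configuration files in the directory in ascending lexicographic order.

    (3) For each file in that order, the configuration is loaded and cniConfig.ValidateNetworkList() checks that the corresponding plugin executables exist in the CNI binary directories; a file that fails to load or validate is skipped with a warning.

    (4) The first configuration that passes is returned immediately. So even if several CNI configuration files are placed in the directory, only one of them takes effect.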
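
    The selection rule (filter by extension, sort, take the first) can be sketched with the standard library alone. This is a simplified illustration, not the kubelet code: firstConfFile and demo are hypothetical helpers, the file names are made up, and the load and validation steps are deliberately omitted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// firstConfFile returns the lexicographically first file in confDir whose
// extension is .conf, .conflist or .json.
// Simplification: the real code also loads and validates each candidate and
// moves on to the next file when that fails.
func firstConfFile(confDir string) (string, error) {
	entries, err := os.ReadDir(confDir)
	if err != nil {
		return "", err
	}
	var files []string
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		switch filepath.Ext(e.Name()) {
		case ".conf", ".conflist", ".json":
			files = append(files, filepath.Join(confDir, e.Name()))
		}
	}
	if len(files) == 0 {
		return "", fmt.Errorf("no networks found in %s", confDir)
	}
	sort.Strings(files)
	return files[0], nil
}

// demo builds a throwaway conf dir with made-up file names and reports
// which file would be picked.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "cni-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"10-flannel.conflist", "05-calico.conf", "README.txt"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("{}"), 0o644); err != nil {
			return "", err
		}
	}
	first, err := firstConfFile(dir)
	if err != nil {
		return "", err
	}
	return filepath.Base(first), nil
}

func main() {
	name, err := demo()
	fmt.Println(name, err)
}
```

    This is why configuration files are commonly given numeric prefixes: the prefix controls which file sorts first and therefore which network becomes the default.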

    // pkg/kubelet/dockershim/network/cni/cni.go
    func getDefaultCNINetwork(confDir string, binDirs []string) (*cniNetwork, error) {
    	files, err := libcni.ConfFiles(confDir, []string{".conf", ".conflist", ".json"})
    	switch {
    	case err != nil:
    		return nil, err
    	case len(files) == 0:
    		return nil, fmt.Errorf("no networks found in %s", confDir)
    	}
    
    	cniConfig := &libcni.CNIConfig{Path: binDirs}
    
    	sort.Strings(files)
    	for _, confFile := range files {
    		var confList *libcni.NetworkConfigList
    		if strings.HasSuffix(confFile, ".conflist") {
    			confList, err = libcni.ConfListFromFile(confFile)
    			if err != nil {
    				klog.Warningf("Error loading CNI config list file %s: %v", confFile, err)
    				continue
    			}
    		} else {
    			conf, err := libcni.ConfFromFile(confFile)
    			if err != nil {
    				klog.Warningf("Error loading CNI config file %s: %v", confFile, err)
    				continue
    			}
    			// Ensure the config has a "type" so we know what plugin to run.
    			// Also catches the case where somebody put a conflist into a conf file.
    			if conf.Network.Type == "" {
    				klog.Warningf("Error loading CNI config file %s: no 'type'; perhaps this is a .conflist?", confFile)
    				continue
    			}
    
    			confList, err = libcni.ConfListFromConf(conf)
    			if err != nil {
    				klog.Warningf("Error converting CNI config file %s to list: %v", confFile, err)
    				continue
    			}
    		}
    		if len(confList.Plugins) == 0 {
    			klog.Warningf("CNI config list %s has no networks, skipping", string(confList.Bytes[:maxStringLengthInLog(len(confList.Bytes))]))
    			continue
    		}
    
    		// Before using this CNI config, we have to validate it to make sure that
    		// all plugins of this config exist on disk
    		caps, err := cniConfig.ValidateNetworkList(context.TODO(), confList)
    		if err != nil {
    			klog.Warningf("Error validating CNI config list %s: %v", string(confList.Bytes[:maxStringLengthInLog(len(confList.Bytes))]), err)
    			continue
    		}
    
    		klog.V(4).Infof("Using CNI configuration file %s", confFile)
    
    		return &cniNetwork{
    			name:          confList.Name,
    			NetworkConfig: confList,
    			CNIConfig:     cniConfig,
    			Capabilities:  caps,
    		}, nil
    	}
    	return nil, fmt.Errorf("no valid networks found in %s", confDir)
    }
    
    plugin.setDefaultNetwork

    Assigns the cniNetwork struct obtained above to the defaultNetwork field of the cniNetworkPlugin struct.

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) setDefaultNetwork(n *cniNetwork) {
    	plugin.Lock()
    	defer plugin.Unlock()
    	plugin.defaultNetwork = n
    }
    

    3.2 network.InitNetworkPlugin

    The main job of network.InitNetworkPlugin() is this: based on the value of networkPluginName (the kubelet startup flag --network-plugin), select the matching network plugin and call its Init() method to initialize it.

    // pkg/kubelet/dockershim/network/plugins.go
    // InitNetworkPlugin inits the plugin that matches networkPluginName. Plugins must have unique names.
    func InitNetworkPlugin(plugins []NetworkPlugin, networkPluginName string, host Host, hairpinMode kubeletconfig.HairpinMode, nonMasqueradeCIDR string, mtu int) (NetworkPlugin, error) {
    	if networkPluginName == "" {
    		// default to the no_op plugin
    		plug := &NoopNetworkPlugin{}
    		plug.Sysctl = utilsysctl.New()
    		if err := plug.Init(host, hairpinMode, nonMasqueradeCIDR, mtu); err != nil {
    			return nil, err
    		}
    		return plug, nil
    	}
    
    	pluginMap := map[string]NetworkPlugin{}
    
    	allErrs := []error{}
    	for _, plugin := range plugins {
    		name := plugin.Name()
    		if errs := validation.IsQualifiedName(name); len(errs) != 0 {
    			allErrs = append(allErrs, fmt.Errorf("network plugin has invalid name: %q: %s", name, strings.Join(errs, ";")))
    			continue
    		}
    
    		if _, found := pluginMap[name]; found {
    			allErrs = append(allErrs, fmt.Errorf("network plugin %q was registered more than once", name))
    			continue
    		}
    		pluginMap[name] = plugin
    	}
    
    	chosenPlugin := pluginMap[networkPluginName]
    	if chosenPlugin != nil {
    		err := chosenPlugin.Init(host, hairpinMode, nonMasqueradeCIDR, mtu)
    		if err != nil {
    			allErrs = append(allErrs, fmt.Errorf("network plugin %q failed init: %v", networkPluginName, err))
    		} else {
    			klog.V(1).Infof("Loaded network plugin %q", networkPluginName)
    		}
    	} else {
    		allErrs = append(allErrs, fmt.Errorf("network plugin %q not found", networkPluginName))
    	}
    
    	return chosenPlugin, utilerrors.NewAggregate(allErrs)
    }
    
    chosenPlugin.Init()

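    When the kubelet startup flag --network-plugin is set to cni, the Init() method of cniNetworkPlugin is called. The code is shown below.

    It starts a goroutine that calls plugin.syncNetworkConfig every 5 seconds. Recall what plugin.syncNetworkConfig() does: it looks for CNI configuration files in the CNI conf directory given by the kubelet startup flags, builds a cniNetwork struct that carries the CNI information, and assigns it to the defaultNetwork field of the cniNetworkPlugin struct. As a result, when the CNI conf files or binaries change, kubelet notices and updates the cniNetworkPlugin struct.

    This also shows why the goroutine exists: it lets the CNI configuration files and executables be updated on the fly, without restarting kubelet.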

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) Init(host network.Host, hairpinMode kubeletconfig.HairpinMode, nonMasqueradeCIDR string, mtu int) error {
    	err := plugin.platformInit()
    	if err != nil {
    		return err
    	}
    
    	plugin.host = host
    
    	plugin.syncNetworkConfig()
    
    	// start a goroutine to sync network config from confDir periodically to detect network config updates in every 5 seconds
    	go wait.Forever(plugin.syncNetworkConfig, defaultSyncConfigPeriod)
    
    	return nil
    }
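
    The periodic-reload pattern used by Init can be sketched as follows. This is a minimal illustration under stated assumptions: netConfig, plugin, the load callback and the stop channel are hypothetical (the real code uses wait.Forever with defaultSyncConfigPeriod and has no stop channel), and a time.Ticker stands in for the wait helper.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// netConfig is a hypothetical stand-in for cniNetwork.
type netConfig struct{ name string }

// plugin keeps the active config behind a RWMutex, as cniNetworkPlugin does
// with defaultNetwork.
type plugin struct {
	sync.RWMutex
	defaultNetwork *netConfig
	load           func() (*netConfig, error) // assumed loader, e.g. reads the conf dir
}

// syncNetworkConfig reloads the config; on error the previous value is kept.
func (p *plugin) syncNetworkConfig() {
	n, err := p.load()
	if err != nil {
		return
	}
	p.Lock()
	defer p.Unlock()
	p.defaultNetwork = n
}

// getDefaultNetwork returns the config under a read lock.
func (p *plugin) getDefaultNetwork() *netConfig {
	p.RLock()
	defer p.RUnlock()
	return p.defaultNetwork
}

// start syncs once, then keeps syncing every period until stop is closed.
func (p *plugin) start(period time.Duration, stop <-chan struct{}) {
	p.syncNetworkConfig()
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				p.syncNetworkConfig()
			case <-stop:
				return
			}
		}
	}()
}

func main() {
	var mu sync.Mutex
	current := "flannel"
	p := &plugin{load: func() (*netConfig, error) {
		mu.Lock()
		defer mu.Unlock()
		return &netConfig{name: current}, nil
	}}
	stop := make(chan struct{})
	p.start(10*time.Millisecond, stop)
	fmt.Println("initial:", p.getDefaultNetwork().name)

	mu.Lock()
	current = "calico" // simulate an updated config file on disk
	mu.Unlock()
	time.Sleep(100 * time.Millisecond)
	fmt.Println("after update:", p.getDefaultNetwork().name)
	close(stop)
}
```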
    

    plugin.platformInit() only checks that nsenter is available; it does nothing else.

    // pkg/kubelet/dockershim/network/cni/cni_others.go
    func (plugin *cniNetworkPlugin) platformInit() error {
    	var err error
    	plugin.nsenterPath, err = plugin.execer.LookPath("nsenter")
    	if err != nil {
    		return err
    	}
    	return nil
    }
    

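    4. How CNI sets up the pod network

    When kubelet creates a pod, it creates and starts the pod sandbox through the CRI, and the CRI then calls the CNI network plugin to set up the pod network.

    The kubelet method through which CNI sets up the pod network is: pkg/kubelet/dockershim/network/cni/cni.go, SetUpPod

    The call chain leading to SetUpPod is as follows (only the key parts are listed):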
    main (cmd/kubelet/kubelet.go)
    ...
    -> klet.syncPod(pkg/kubelet/kubelet.go)
    -> kl.containerRuntime.SyncPod(pkg/kubelet/kubelet.go)
    -> m.createPodSandbox(pkg/kubelet/kuberuntime/kuberuntime_manager.go)
    -> m.runtimeService.RunPodSandbox (pkg/kubelet/kuberuntime/kuberuntime_sandbox.go)
    -> ds.network.SetUpPod(pkg/kubelet/dockershim/docker_sandbox.go)
    -> pm.plugin.SetUpPod(pkg/kubelet/dockershim/network/plugins.go)
    -> SetUpPod(pkg/kubelet/dockershim/network/cni/cni.go)

    The code below is listed only to show the call chain leading to the key method cniNetworkPlugin.SetUpPod(); it is not analyzed in detail.

    // pkg/kubelet/kuberuntime/kuberuntime_manager.go
    func (m *kubeGenericRuntimeManager) SyncPod(pod *v1.Pod, podStatus *kubecontainer.PodStatus, pullSecrets []v1.Secret, backOff *flowcontrol.Backoff) (result kubecontainer.PodSyncResult) {
    	...
    	podSandboxID, msg, err = m.createPodSandbox(pod, podContainerChanges.Attempt)
    	...
    }
    
    // pkg/kubelet/kuberuntime/kuberuntime_sandbox.go
    // createPodSandbox creates a pod sandbox and returns (podSandBoxID, message, error).
    func (m *kubeGenericRuntimeManager) createPodSandbox(pod *v1.Pod, attempt uint32) (string, string, error) {
        ...
        podSandBoxID, err := m.runtimeService.RunPodSandbox(podSandboxConfig, runtimeHandler)
        ...
    }
    

    The RunPodSandbox method shows the order of operations: the pod sandbox is created first, then started, and only then is the network set up for it.

    // pkg/kubelet/dockershim/docker_sandbox.go
    func (ds *dockerService) RunPodSandbox(ctx context.Context, r *runtimeapi.RunPodSandboxRequest) (*runtimeapi.RunPodSandboxResponse, error) {
        ...
        createResp, err := ds.client.CreateContainer(*createConfig)
        ...
        err = ds.client.StartContainer(createResp.ID)
        ...
        err = ds.network.SetUpPod(config.GetMetadata().Namespace, config.GetMetadata().Name, cID, config.Annotations, networkOptions)
        ...
    }
    

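    The PluginManager.SetUpPod method calls pm.plugin.SetUpPod. As explained in the CNI initialization section, the plugin field was set during initialization, so this call ends up in the SetUpPod method of cniNetworkPlugin.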

    // pkg/kubelet/dockershim/network/plugins.go
    func (pm *PluginManager) SetUpPod(podNamespace, podName string, id kubecontainer.ContainerID, annotations, options map[string]string) error {
    	defer recordOperation("set_up_pod", time.Now())
    	fullPodName := kubecontainer.BuildPodFullName(podName, podNamespace)
    	pm.podLock(fullPodName).Lock()
    	defer pm.podUnlock(fullPodName)
    
    	klog.V(3).Infof("Calling network plugin %s to set up pod %q", pm.plugin.Name(), fullPodName)
    	if err := pm.plugin.SetUpPod(podNamespace, podName, id, annotations, options); err != nil {
    		return fmt.Errorf("networkPlugin %s failed to set up pod %q network: %v", pm.plugin.Name(), fullPodName, err)
    	}
    
    	return nil
    }
    

    cniNetworkPlugin.SetUpPod

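    The cniNetworkPlugin.SetUpPod method is the entry point through which the CNI network plugin sets up the pod network. Its main logic is:
    (1) call plugin.checkInitialized(): check that the network plugin has finished initializing;
    (2) call plugin.host.GetNetNS(): get the path of the container's network namespace, in the form /proc/${container PID}/ns/net;
    (3) call context.WithTimeout(): set the timeout for calling the CNI network plugin;
    (4) call plugin.addToNetwork(): on Linux, call the CNI network plugin to set up the loopback network for the pod;
    (5) call plugin.addToNetwork(): call the CNI network plugin to set up the default network for the pod.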
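
    The control flow of these steps (one shared deadline, an optional loopback step, then the default network, stopping at the first error) can be sketched like this. addFunc, setUpPod and slowAdd are hypothetical names used for illustration only, and the plugin calls are simulated with timers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// addFunc is a hypothetical stand-in for plugin.addToNetwork on one network.
type addFunc func(ctx context.Context) error

// setUpPod runs the loopback step (when present) and then the default
// network step under one shared deadline, stopping at the first error.
func setUpPod(timeout time.Duration, lo, def addFunc) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if lo != nil { // nil models a platform without a loopback network
		if err := lo(ctx); err != nil {
			return err
		}
	}
	return def(ctx)
}

// slowAdd simulates a plugin call that takes d unless the deadline hits first.
func slowAdd(d time.Duration) addFunc {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	// Both steps finish well within the deadline.
	fmt.Println(setUpPod(time.Second, slowAdd(time.Millisecond), slowAdd(time.Millisecond)))

	// The default network step outlives the deadline and is cut off.
	err := setUpPod(20*time.Millisecond, slowAdd(time.Millisecond), slowAdd(time.Second))
	fmt.Println(errors.Is(err, context.DeadlineExceeded))
}
```

    Because both steps share one context, time spent on the loopback step counts against the budget left for the default network.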

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) SetUpPod(namespace string, name string, id kubecontainer.ContainerID, annotations, options map[string]string) error {
    	if err := plugin.checkInitialized(); err != nil {
    		return err
    	}
    	netnsPath, err := plugin.host.GetNetNS(id.ID)
    	if err != nil {
    		return fmt.Errorf("CNI failed to retrieve network namespace path: %v", err)
    	}
    
    	// Todo get the timeout from parent ctx
    	cniTimeoutCtx, cancelFunc := context.WithTimeout(context.Background(), network.CNITimeoutSec*time.Second)
    	defer cancelFunc()
    	// Windows doesn't have loNetwork. It comes only with Linux
    	if plugin.loNetwork != nil {
    		if _, err = plugin.addToNetwork(cniTimeoutCtx, plugin.loNetwork, name, namespace, id, netnsPath, annotations, options); err != nil {
    			return err
    		}
    	}
    
    	_, err = plugin.addToNetwork(cniTimeoutCtx, plugin.getDefaultNetwork(), name, namespace, id, netnsPath, annotations, options)
    	return err
    }
    
    plugin.addToNetwork

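    The plugin.addToNetwork method calls the CNI network plugin to set up a given network for the pod. Its main logic is:
    (1) call plugin.buildCNIRuntimeConf(): build the runtime configuration used to call the CNI network plugin;
    (2) call cniNet.AddNetworkList(): call the CNI network plugin to set up the network.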

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) addToNetwork(ctx context.Context, network *cniNetwork, podName string, podNamespace string, podSandboxID kubecontainer.ContainerID, podNetnsPath string, annotations, options map[string]string) (cnitypes.Result, error) {
    	rt, err := plugin.buildCNIRuntimeConf(podName, podNamespace, podSandboxID, podNetnsPath, annotations, options)
    	if err != nil {
    		klog.Errorf("Error adding network when building cni runtime conf: %v", err)
    		return nil, err
    	}
    
    	pdesc := podDesc(podNamespace, podName, podSandboxID)
    	netConf, cniNet := network.NetworkConfig, network.CNIConfig
    	klog.V(4).Infof("Adding %s to network %s/%s netns %q", pdesc, netConf.Plugins[0].Network.Type, netConf.Name, podNetnsPath)
    	res, err := cniNet.AddNetworkList(ctx, netConf, rt)
    	if err != nil {
    		klog.Errorf("Error adding %s to network %s/%s: %v", pdesc, netConf.Plugins[0].Network.Type, netConf.Name, err)
    		return nil, err
    	}
    	klog.V(4).Infof("Added %s to network %s: %v", pdesc, netConf.Name, res)
    	return res, nil
    }
    
    cniNet.AddNetworkList

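    AddNetworkList mainly calls the addNetwork method for each plugin in the list, so let us look at the logic of addNetwork:
    (1) call c.exec.FindInPath(): resolve the absolute path of the CNI plugin executable;
    (2) call buildOneConfig(): build the configuration for this plugin;
    (3) call c.args(): build the arguments for calling the CNI plugin;
    (4) call invoke.ExecPluginWithResult(): run the CNI plugin to set up the pod network.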

    // vendor/github.com/containernetworking/cni/libcni/api.go 
    func (c *CNIConfig) AddNetworkList(ctx context.Context, list *NetworkConfigList, rt *RuntimeConf) (types.Result, error) {
    	var err error
    	var result types.Result
    	for _, net := range list.Plugins {
    		result, err = c.addNetwork(ctx, list.Name, list.CNIVersion, net, result, rt)
    		if err != nil {
    			return nil, err
    		}
    	}
    
    	if err = setCachedResult(result, list.Name, rt); err != nil {
    		return nil, fmt.Errorf("failed to set network %q cached result: %v", list.Name, err)
    	}
    
    	return result, nil
    }
    
    func (c *CNIConfig) addNetwork(ctx context.Context, name, cniVersion string, net *NetworkConfig, prevResult types.Result, rt *RuntimeConf) (types.Result, error) {
    	c.ensureExec()
    	pluginPath, err := c.exec.FindInPath(net.Network.Type, c.Path)
    	if err != nil {
    		return nil, err
    	}
    
    	newConf, err := buildOneConfig(name, cniVersion, net, prevResult, rt)
    	if err != nil {
    		return nil, err
    	}
    
    	return invoke.ExecPluginWithResult(ctx, pluginPath, newConf.Bytes, c.args("ADD", rt), c.exec)
    }
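
    AddNetworkList passes the result of each plugin to the next one as prevResult. The chaining can be sketched in isolation as below; result, pluginFunc and addNetworkList are hypothetical, simplified stand-ins for this example, and the plugin names in main are only labels.

```go
package main

import "fmt"

// result is a hypothetical, simplified stand-in for a CNI result.
type result struct{ interfaces []string }

// pluginFunc models one plugin invocation: it receives the previous plugin's
// result (nil for the first plugin) and returns its own.
type pluginFunc func(prev *result) (*result, error)

// addNetworkList runs the plugins in order, feeding each one the previous
// result, and stops at the first error.
func addNetworkList(plugins []pluginFunc) (*result, error) {
	var res *result
	for i, p := range plugins {
		next, err := p(res)
		if err != nil {
			return nil, fmt.Errorf("plugin %d failed: %w", i, err)
		}
		res = next
	}
	return res, nil
}

func main() {
	bridge := func(prev *result) (*result, error) {
		// The first plugin in the chain creates the interface.
		return &result{interfaces: []string{"eth0"}}, nil
	}
	portmap := func(prev *result) (*result, error) {
		// A chained plugin builds on the previous result and passes it through.
		return &result{interfaces: append([]string{}, prev.interfaces...)}, nil
	}
	res, err := addNetworkList([]pluginFunc{bridge, portmap})
	fmt.Println(res.interfaces, err)
}
```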
    
    c.args

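    The c.args method builds the arguments used when running the CNI plugin executable.

    As the code shows, the arguments include Command (ADD sets up the network, DEL tears it down), ContainerID (the container ID), NetNS (the path of the container's network namespace), IfName (the network interface name), PluginArgs (extra arguments such as the pod name and pod namespace), and so on.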

    // vendor/github.com/containernetworking/cni/libcni/api.go
    func (c *CNIConfig) args(action string, rt *RuntimeConf) *invoke.Args {
    	return &invoke.Args{
    		Command:     action,
    		ContainerID: rt.ContainerID,
    		NetNS:       rt.NetNS,
    		PluginArgs:  rt.Args,
    		IfName:      rt.IfName,
    		Path:        strings.Join(c.Path, string(os.PathListSeparator)),
    	}
    }
    
    invoke.ExecPluginWithResult

    invoke.ExecPluginWithResult converts the call arguments into environment variables, runs the CNI plugin executable, and collects the result it returns.

    func ExecPluginWithResult(ctx context.Context, pluginPath string, netconf []byte, args CNIArgs, exec Exec) (types.Result, error) {
    	if exec == nil {
    		exec = defaultExec
    	}
    
    	stdoutBytes, err := exec.ExecPlugin(ctx, pluginPath, netconf, args.AsEnv())
    	if err != nil {
    		return nil, err
    	}
    
    	// Plugin must return result in same version as specified in netconf
    	versionDecoder := &version.ConfigDecoder{}
    	confVersion, err := versionDecoder.Decode(netconf)
    	if err != nil {
    		return nil, err
    	}
    
    	return version.NewResult(confVersion, stdoutBytes)
    }
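
    To show what "arguments as environment variables" looks like, here is a minimal sketch that renders the fields of the arguments struct as the CNI_* variables defined by the CNI specification. The args type and asEnv method are hypothetical, trimmed-down counterparts written for this example, and the K8S_POD_* pairs in main are illustrative values.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// args is a hypothetical, trimmed-down counterpart of invoke.Args.
type args struct {
	Command     string
	ContainerID string
	NetNS       string
	PluginArgs  [][2]string
	IfName      string
	Path        string
}

// asEnv renders the arguments as CNI_* environment variables, appended to
// the current process environment.
func (a args) asEnv() []string {
	// Extra arguments are encoded as "KEY=VALUE" pairs joined by semicolons.
	pairs := make([]string, 0, len(a.PluginArgs))
	for _, kv := range a.PluginArgs {
		pairs = append(pairs, kv[0]+"="+kv[1])
	}
	return append(os.Environ(),
		"CNI_COMMAND="+a.Command,
		"CNI_CONTAINERID="+a.ContainerID,
		"CNI_NETNS="+a.NetNS,
		"CNI_ARGS="+strings.Join(pairs, ";"),
		"CNI_IFNAME="+a.IfName,
		"CNI_PATH="+a.Path,
	)
}

func main() {
	a := args{
		Command:     "ADD",
		ContainerID: "abc123",
		NetNS:       "/proc/4242/ns/net",
		PluginArgs:  [][2]string{{"K8S_POD_NAMESPACE", "default"}, {"K8S_POD_NAME", "nginx"}},
		IfName:      "eth0",
		Path:        "/opt/cni/bin",
	}
	// Print only the CNI_* entries, not the inherited environment.
	for _, e := range a.asEnv() {
		if strings.HasPrefix(e, "CNI_") {
			fmt.Println(e)
		}
	}
}
```

    The network configuration itself is not passed through the environment; the plugin reads it as JSON on standard input and writes its result to standard output.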
    

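    5. How CNI tears down the pod network

    When kubelet deletes a pod, the CRI calls the CNI network plugin to tear down the pod network.

    The kubelet method through which CNI tears down the pod network is: pkg/kubelet/dockershim/network/cni/cni.go, TearDownPod

    The call chain leading to TearDownPod is as follows (only the key parts are listed):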
    main (cmd/kubelet/kubelet.go)
    ...
    -> m.runtimeService.StopPodSandbox (pkg/kubelet/kuberuntime/kuberuntime_sandbox.go)
    -> ds.network.TearDownPod(pkg/kubelet/dockershim/docker_sandbox.go)
    -> pm.plugin.TearDownPod(pkg/kubelet/dockershim/network/plugins.go)
    -> TearDownPod(pkg/kubelet/dockershim/network/cni/cni.go)

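    The code below is listed only to trace the call chain down to the key method cniNetworkPlugin.TearDownPod(); it is not analyzed in detail.

    StopPodSandbox first tears down the pod network and then stops the pod sandbox. If either step fails, the kubelet retries until both succeed, so there is no strict requirement on the order in which the two steps succeed (removing the pod sandbox itself is left to the kubelet's garbage collection).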

    // pkg/kubelet/dockershim/docker_sandbox.go
    func (ds *dockerService) StopPodSandbox(ctx context.Context, r *runtimeapi.StopPodSandboxRequest) (*runtimeapi.StopPodSandboxResponse, error) {
        ...
        // WARNING: The following operations made the following assumption:
    	// 1. kubelet will retry on any error returned by StopPodSandbox.
    	// 2. tearing down network and stopping sandbox container can succeed in any sequence.
    	// This depends on the implementation detail of network plugin and proper error handling.
    	// For kubenet, if tearing down network failed and sandbox container is stopped, kubelet
    	// will retry. On retry, kubenet will not be able to retrieve network namespace of the sandbox
    	// since it is stopped. With empty network namespace, CNI bridge plugin will conduct best
    	// effort clean up and will not return error.
    	errList := []error{}
    	ready, ok := ds.getNetworkReady(podSandboxID)
    	if !hostNetwork && (ready || !ok) {
    		// Only tear down the pod network if we haven't done so already
    		cID := kubecontainer.BuildContainerID(runtimeName, podSandboxID)
    		err := ds.network.TearDownPod(namespace, name, cID)
    		if err == nil {
    			ds.setNetworkReady(podSandboxID, false)
    		} else {
    			errList = append(errList, err)
    		}
    	}
    	if err := ds.client.StopContainer(podSandboxID, defaultSandboxGracePeriod); err != nil {
    		// Do not return error if the container does not exist
    		if !libdocker.IsContainerNotFoundError(err) {
    			klog.Errorf("Failed to stop sandbox %q: %v", podSandboxID, err)
    			errList = append(errList, err)
    		} else {
    			// remove the checkpoint for any sandbox that is not found in the runtime
    			ds.checkpointManager.RemoveCheckpoint(podSandboxID)
    		}
    	}
        ...
    }
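
    The retry assumption described above can be sketched as follows: both steps are attempted on every call, failures are collected rather than returned early, and a retry only repeats the work that is still outstanding. The sandbox type and its fields are hypothetical and only model the behavior; they are not dockershim types.

```go
package main

import (
	"errors"
	"fmt"
)

// sandbox is a hypothetical model of the state StopPodSandbox works on.
type sandbox struct {
	networkReady bool
	running      bool
	failNetOnce  bool // simulates one failed network teardown
}

// stop attempts both steps on every call and reports all failures together,
// so a caller can simply retry until it returns nil.
func (s *sandbox) stop() error {
	var errs []error
	if s.networkReady { // only tear down the network if not already done
		if s.failNetOnce {
			s.failNetOnce = false
			errs = append(errs, errors.New("network teardown failed"))
		} else {
			s.networkReady = false
		}
	}
	s.running = false // stopping the container is attempted regardless
	if len(errs) == 0 {
		return nil
	}
	return fmt.Errorf("stop sandbox: %v", errs)
}

func main() {
	s := &sandbox{networkReady: true, running: true, failNetOnce: true}
	for attempt := 1; ; attempt++ {
		err := s.stop()
		fmt.Printf("attempt %d: err=%v\n", attempt, err)
		if err == nil {
			break
		}
	}
}
```

    In this sketch the first attempt stops the container but reports the network failure, and the second attempt finishes the network teardown.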
    

    PluginManager.TearDownPod calls pm.plugin.TearDownPod. As described in the CNI initialization section, pm.plugin was assigned the CNI plugin during initialization, so this call lands in cniNetworkPlugin.TearDownPod.

    // pkg/kubelet/dockershim/network/plugins.go
    func (pm *PluginManager) TearDownPod(podNamespace, podName string, id kubecontainer.ContainerID) error {
    	defer recordOperation("tear_down_pod", time.Now())
    	fullPodName := kubecontainer.BuildPodFullName(podName, podNamespace)
    	pm.podLock(fullPodName).Lock()
    	defer pm.podUnlock(fullPodName)
    
    	klog.V(3).Infof("Calling network plugin %s to tear down pod %q", pm.plugin.Name(), fullPodName)
    	if err := pm.plugin.TearDownPod(podNamespace, podName, id); err != nil {
    		return fmt.Errorf("networkPlugin %s failed to teardown pod %q network: %v", pm.plugin.Name(), fullPodName, err)
    	}
    
    	return nil
    }
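
    PluginManager.TearDownPod serializes operations per pod by taking a lock keyed on the pod's full name. A minimal sketch of that idea is shown below. The podLocks type and its methods are hypothetical, and unlike a production implementation this sketch never frees locks for pods that no longer exist.

```go
package main

import (
	"fmt"
	"sync"
)

// podLocks hands out one mutex per pod name (hypothetical helper).
type podLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

// get returns the mutex for the named pod, creating it on first use.
func (p *podLocks) get(name string) *sync.Mutex {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.locks == nil {
		p.locks = make(map[string]*sync.Mutex)
	}
	l, ok := p.locks[name]
	if !ok {
		l = &sync.Mutex{}
		p.locks[name] = l
	}
	return l
}

// withPodLock runs fn while holding the lock for the named pod.
func (p *podLocks) withPodLock(name string, fn func()) {
	l := p.get(name)
	l.Lock()
	defer l.Unlock()
	fn()
}

func main() {
	var locks podLocks
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// All goroutines target the same pod, so the increments are serialized.
			locks.withPodLock("web-0_default", func() { counter++ })
		}()
	}
	wg.Wait()
	fmt.Println(counter)
}
```

    Operations on different pods use different mutexes and can proceed in parallel, while setup and teardown for the same pod never overlap.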
    

    cniNetworkPlugin.TearDownPod

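    cniNetworkPlugin.TearDownPod is the entry point through which the CNI network plugin tears down the pod network. Its main logic is:
    (1) call plugin.checkInitialized() to check that the network plugin has finished initializing;
    (2) call plugin.host.GetNetNS() to get the path of the container's network namespace, in the form /proc/${container PID}/ns/net; a failure here is only logged and is not fatal;
    (3) call context.WithTimeout() to set a timeout for the CNI plugin invocations;
    (4) call plugin.deleteFromNetwork() to tear down the pod's loopback network if one is configured (Linux only); a failure here is only logged;
    (5) call plugin.deleteFromNetwork() to tear down the pod's default network.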

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) TearDownPod(namespace string, name string, id kubecontainer.ContainerID) error {
    	if err := plugin.checkInitialized(); err != nil {
    		return err
    	}
    
    	// Lack of namespace should not be fatal on teardown
    	netnsPath, err := plugin.host.GetNetNS(id.ID)
    	if err != nil {
    		klog.Warningf("CNI failed to retrieve network namespace path: %v", err)
    	}
    
    	// Todo get the timeout from parent ctx
    	cniTimeoutCtx, cancelFunc := context.WithTimeout(context.Background(), network.CNITimeoutSec*time.Second)
    	defer cancelFunc()
    	// Windows doesn't have loNetwork. It comes only with Linux
    	if plugin.loNetwork != nil {
    		// Loopback network deletion failure should not be fatal on teardown
    		if err := plugin.deleteFromNetwork(cniTimeoutCtx, plugin.loNetwork, name, namespace, id, netnsPath, nil); err != nil {
    			klog.Warningf("CNI failed to delete loopback network: %v", err)
    		}
    	}
    
    	return plugin.deleteFromNetwork(cniTimeoutCtx, plugin.getDefaultNetwork(), name, namespace, id, netnsPath, nil)
    }
    
    plugin.deleteFromNetwork

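    plugin.deleteFromNetwork calls the CNI network plugin to tear down the given network of a pod. Its main logic is:
    (1) call plugin.buildCNIRuntimeConf() to build the runtime configuration for the CNI plugin invocation;
    (2) call cniNet.DelNetworkList() to invoke the CNI plugin and tear down the pod network; a "no such file or directory" error is ignored, because an earlier attempt may already have deleted the network.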

    // pkg/kubelet/dockershim/network/cni/cni.go
    func (plugin *cniNetworkPlugin) deleteFromNetwork(ctx context.Context, network *cniNetwork, podName string, podNamespace string, podSandboxID kubecontainer.ContainerID, podNetnsPath string, annotations map[string]string) error {
    	rt, err := plugin.buildCNIRuntimeConf(podName, podNamespace, podSandboxID, podNetnsPath, annotations, nil)
    	if err != nil {
    		klog.Errorf("Error deleting network when building cni runtime conf: %v", err)
    		return err
    	}
    
    	pdesc := podDesc(podNamespace, podName, podSandboxID)
    	netConf, cniNet := network.NetworkConfig, network.CNIConfig
    	klog.V(4).Infof("Deleting %s from network %s/%s netns %q", pdesc, netConf.Plugins[0].Network.Type, netConf.Name, podNetnsPath)
    	err = cniNet.DelNetworkList(ctx, netConf, rt)
    	// The pod may not get deleted successfully at the first time.
    	// Ignore "no such file or directory" error in case the network has already been deleted in previous attempts.
    	if err != nil && !strings.Contains(err.Error(), "no such file or directory") {
    		klog.Errorf("Error deleting %s from network %s/%s: %v", pdesc, netConf.Plugins[0].Network.Type, netConf.Name, err)
    		return err
    	}
    	klog.V(4).Infof("Deleted %s from network %s/%s", pdesc, netConf.Plugins[0].Network.Type, netConf.Name)
    	return nil
    }
    
    cniNet.DelNetworkList

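    DelNetworkList mainly calls delNetwork for each plugin in the list, in reverse order, so let us look at the logic of delNetwork:
    (1) call c.exec.FindInPath() to resolve the absolute path of the CNI plugin executable;
    (2) call buildOneConfig() to build the configuration;
    (3) call c.args() to build the arguments for the CNI plugin invocation;
    (4) call invoke.ExecPluginWithoutResult() to invoke the CNI plugin and tear down the pod network.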

    // vendor/github.com/containernetworking/cni/libcni/api.go 
    // DelNetworkList executes a sequence of plugins with the DEL command
    func (c *CNIConfig) DelNetworkList(ctx context.Context, list *NetworkConfigList, rt *RuntimeConf) error {
    	var cachedResult types.Result
    
    	// Cached result on DEL was added in CNI spec version 0.4.0 and higher
    	if gtet, err := version.GreaterThanOrEqualTo(list.CNIVersion, "0.4.0"); err != nil {
    		return err
    	} else if gtet {
    		cachedResult, err = getCachedResult(list.Name, list.CNIVersion, rt)
    		if err != nil {
    			return fmt.Errorf("failed to get network %q cached result: %v", list.Name, err)
    		}
    	}
    
    	for i := len(list.Plugins) - 1; i >= 0; i-- {
    		net := list.Plugins[i]
    		if err := c.delNetwork(ctx, list.Name, list.CNIVersion, net, cachedResult, rt); err != nil {
    			return err
    		}
    	}
    	_ = delCachedResult(list.Name, rt)
    
    	return nil
    }
    
    func (c *CNIConfig) delNetwork(ctx context.Context, name, cniVersion string, net *NetworkConfig, prevResult types.Result, rt *RuntimeConf) error {
    	c.ensureExec()
    	pluginPath, err := c.exec.FindInPath(net.Network.Type, c.Path)
    	if err != nil {
    		return err
    	}
    
    	newConf, err := buildOneConfig(name, cniVersion, net, prevResult, rt)
    	if err != nil {
    		return err
    	}
    
    	return invoke.ExecPluginWithoutResult(ctx, pluginPath, newConf.Bytes, c.args("DEL", rt), c.exec)
    }
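
    DelNetworkList walks the plugin list from the last entry to the first, which is the reverse of the order used when the network is set up. The sketch below shows only that ordering; delOrder is a hypothetical helper and the plugin names are sample values.

```go
package main

import "fmt"

// delOrder returns the order in which a chained plugin list is torn down:
// last plugin first, mirroring the loop in DelNetworkList.
func delOrder(plugins []string) []string {
	out := make([]string, 0, len(plugins))
	for i := len(plugins) - 1; i >= 0; i-- {
		out = append(out, plugins[i])
	}
	return out
}

func main() {
	chain := []string{"bridge", "portmap", "bandwidth"} // sample plugin chain
	fmt.Println("ADD order:", chain)
	fmt.Println("DEL order:", delOrder(chain))
}
```

    Tearing down in reverse lets each plugin undo its work while the state created by the plugins before it in the chain still exists.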
    
    c.args

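    The c.args method builds the arguments that are passed to the CNI plugin executable when it is invoked.

    As the code shows, the arguments are Command (ADD sets up the network, DEL tears it down), ContainerID (the container ID), NetNS (the path of the container's network namespace), IfName (the network interface name), PluginArgs (extra arguments such as the pod name and pod namespace), and Path (the directories searched for plugin executables).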

    // vendor/github.com/containernetworking/cni/libcni/api.go
    func (c *CNIConfig) args(action string, rt *RuntimeConf) *invoke.Args {
    	return &invoke.Args{
    		Command:     action,
    		ContainerID: rt.ContainerID,
    		NetNS:       rt.NetNS,
    		PluginArgs:  rt.Args,
    		IfName:      rt.IfName,
    		Path:        strings.Join(c.Path, string(os.PathListSeparator)),
    	}
    }
    
    invoke.ExecPluginWithResult

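    invoke.ExecPluginWithResult converts the call arguments into environment variables, runs the CNI plugin executable, and decodes the returned result. Note that delNetwork above calls invoke.ExecPluginWithoutResult rather than this function; the ExecPluginWithResult variant from the ADD path is repeated here only to show how the arguments become environment variables through args.AsEnv().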

    func ExecPluginWithResult(ctx context.Context, pluginPath string, netconf []byte, args CNIArgs, exec Exec) (types.Result, error) {
    	if exec == nil {
    		exec = defaultExec
    	}
    
    	stdoutBytes, err := exec.ExecPlugin(ctx, pluginPath, netconf, args.AsEnv())
    	if err != nil {
    		return nil, err
    	}
    
    	// Plugin must return result in same version as specified in netconf
    	versionDecoder := &version.ConfigDecoder{}
    	confVersion, err := versionDecoder.Decode(netconf)
    	if err != nil {
    		return nil, err
    	}
    
    	return version.NewResult(confVersion, stdoutBytes)
    }
    

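    Summary

    CNI

    CNI stands for Container Network Interface.

    CNI is the standard interface through which Kubernetes calls network implementations. The kubelet uses this interface to call different network plugins, which implement different ways of configuring the network.

    A CNI network plugin is an executable that conforms to the Container Network Interface (CNI) specification. Common CNI network plugins include Calico, flannel, Terway, and Weave Net.

    When the kubelet is configured to use a CNI network plugin (through its startup flags), it calls the CNI plugin through the CRI to set up and tear down the pod network whenever a pod is created or deleted.

    How the kubelet sets up the pod network, in outline

    (1) the kubelet first creates the pause container (pod sandbox) through the CRI, which creates the network namespace;
    (2) the kubelet calls the concrete network plugin, such as a CNI network plugin, according to its startup flags;
    (3) the network plugin configures the network of the pause container (pod sandbox);
    (4) all other containers in the pod share the network of the pause container (pod sandbox).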

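    kubelet startup flags related to CNI

    (1) --network-plugin: the type of network plugin to use. Valid values are cni, kubenet, and "". The default is the empty string, which means Noop: no network plugin is configured and no pod network is set up. Setting it to cni makes the kubelet use a CNI network plugin.

    (2) --cni-conf-dir: the directory containing the CNI configuration files. Default: /etc/cni/net.d

    (3) --cni-bin-dir: the directory containing the CNI plugin executables; the kubelet looks in this directory for the plugin executables that perform pod network operations. Default: /opt/cni/bin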

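    CNI initialization in the kubelet

    After the kubelet starts, it reads the CNI configuration files according to the CNI-related startup flags and initializes the CNI network plugin. Later, when pods are created or deleted, the SetUpPod and TearDownPod methods are called to set up and tear down the pod network. Initialization also starts a goroutine that periodically re-checks the CNI configuration files and executables, so that changes to them are picked up without restarting the kubelet.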

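    Setting up the pod network with CNI

    When the kubelet creates a pod, it creates and starts the pod sandbox through the CRI, and the CRI then calls the CNI network plugin to set up the pod network.

    In the kubelet, the method that sets up the pod network through CNI is SetUpPod in pkg/kubelet/dockershim/network/cni/cni.go.

    Tearing down the pod network with CNI

    When the kubelet deletes a pod, the CRI calls the CNI network plugin to tear down the pod network.

    In the kubelet, the method that tears down the pod network through CNI is TearDownPod in pkg/kubelet/dockershim/network/cni/cni.go.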

    posted @ 2021-08-22 10:44  良凱爾