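Introduction
Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform that automates the deployment, scaling, and management of applications. It was originally designed by Google and donated to the Cloud Native Computing Foundation (CNCF), and it now sits at the core of the cloud native ecosystem. Kubernetes schedules containers across a cluster of hosts, which makes scaling, upgrading, and recovering applications simpler and more efficient. This article covers the architecture, core concepts, and practical use of Kubernetes to help readers get up to speed with it quickly.
See the official Kubernetes documentation for reference.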
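Why choose Kubernetes?
Among container orchestrators, Kubernetes has the following clear advantages:
- Scalability: scale an application from a single instance to thousands of instances
- Automation: automated deployment, autoscaling, and self-repair
- Service discovery and load balancing: built-in DNS and load-balancing strategies
- Storage orchestration: mounts storage systems automatically and supports many storage types
- Automatic bin packing: places containers according to their resource requirements and constraints
- Self-healing: restarts failed containers and reschedules Pods away from unhealthy nodes
- Configuration management: manages secrets and configuration without rebuilding images
- Batch execution: manages batch and CI workloads
- Rich ecosystem: plugins and tools for monitoring, logging, CI/CD, and more
- Community support: an active open-source community provides ongoing support and updates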
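Kubernetes architecture
A Kubernetes cluster consists of a control plane and a set of worker nodes that the control plane manages.
Control plane components
- kube-apiserver: the API server, the entry point for every operation on the cluster
- etcd: a distributed key-value store that holds all cluster data
- kube-scheduler: assigns Pods to nodes according to resource requirements and policies
- kube-controller-manager: runs controller processes such as the node controller and the replication controller
- cloud-controller-manager: runs the controllers that interact with the cloud provider
Worker node components
- kubelet: makes sure the containers described in each Pod spec are running
- kube-proxy: maintains network rules on the node and implements the Service abstraction
- Container runtime: runs the containers, for example containerd or Docker Engine (which requires the cri-dockerd adapter since Kubernetes 1.24)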
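Installing Kubernetes
Kubernetes can be installed in several ways. The most common ones are described below.
Installing with kubeadm
Note: the apt.kubernetes.io repository used below was deprecated in 2023 in favor of pkgs.k8s.io, so check the official kubeadm installation guide for the current repository setup before running these commands.
```shell
# Install prerequisites
apt-get update && apt-get install -y apt-transport-https curl

# Add the Kubernetes apt repository (legacy location, see the note above)
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update

# Install kubelet, kubeadm and kubectl, and hold them at the installed version
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

# Initialize the control plane (10.244.0.0/16 is Flannel's default Pod CIDR)
kubeadm init --pod-network-cidr=10.244.0.0/16

# Set up kubectl access for the current user
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

# Install the Flannel network add-on (the project now lives at flannel-io/flannel)
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```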
Running locally with Minikube
```shell
# Download and install the Minikube binary
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start a local single-node cluster
minikube start
```
Using a cloud provider
Most cloud providers offer a managed Kubernetes service, such as Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS).
Kubernetes core concepts
Pod
A Pod is the smallest deployable unit in Kubernetes and contains one or more containers:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.19
    ports:
    - containerPort: 80
```
ReplicaSet
A ReplicaSet keeps a specified number of Pod replicas running:
```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
```
Deployment
A Deployment manages ReplicaSets and provides declarative updates and rollbacks. The rollingUpdate settings control how many extra Pods may be created (maxSurge) and how many may be unavailable (maxUnavailable) while an update is in progress:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
```
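To make the rolling-update settings concrete, the following sketch works out the bounds they imply for the Deployment above. It only covers absolute values as used in the manifest (percentages are rounded by Kubernetes and are not handled here), and the variable names are illustrative.

```shell
# Values taken from the nginx-deployment manifest above.
replicas=3
max_surge=1
max_unavailable=1

# Upper bound on the number of Pods that may exist at once during a rollout.
max_total=$((replicas + max_surge))
# Lower bound on the number of Pods that must stay available during a rollout.
min_available=$((replicas - max_unavailable))

echo "max pods during rollout: $max_total"
echo "min available pods: $min_available"
```

With these settings a rollout never runs more than 4 Pods and never drops below 2 available Pods.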
Service
A Service is an abstraction for reaching a set of Pods and gives them a stable network endpoint:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
```
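Inside the cluster a Service is normally reached through its DNS name. The sketch below builds that name for the Service above; it assumes the default cluster domain cluster.local, and the helper name service_fqdn is made up for this example.

```shell
# Build the in-cluster DNS name of a Service.
# Assumption: the cluster uses the default domain "cluster.local".
service_fqdn() {
  # $1 = Service name, $2 = namespace (falls back to "default")
  printf '%s.%s.svc.cluster.local\n' "$1" "${2:-default}"
}

service_fqdn nginx-service
service_fqdn nginx-service development
```

Pods in the same namespace can also use the short name nginx-service; the full name is needed when calling across namespaces.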
ConfigMap and Secret
A ConfigMap stores non-sensitive configuration, while a Secret stores sensitive data. Values under data in a Secret must be base64-encoded:
```yaml
# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    environment=production
    log_level=info
---
# Secret
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  db-password: cGFzc3dvcmQ=
```
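The db-password value in the Secret above is plain base64, which can be reproduced and reversed with standard tools. This is a small sketch using the base64 utility; the -d flag is the GNU and BusyBox spelling and may differ on other platforms.

```shell
# Encode a value the way it must appear under "data:" in a Secret.
# printf is used instead of echo so that no trailing newline gets encoded.
encoded=$(printf '%s' 'password' | base64)
echo "$encoded"

# Decode it again to confirm the round trip.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```

Because anyone can decode the value this way, base64 is an encoding and not encryption; access to Secrets still has to be restricted.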
Namespace
A Namespace divides the resources of one cluster into multiple virtual clusters:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: development
```
PersistentVolume and PersistentVolumeClaim
A PersistentVolume (PV) provides a piece of storage, and a PersistentVolumeClaim (PVC) is a request for such storage. A claim binds to a volume whose capacity, access modes, and storage class satisfy the request:
```yaml
# PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    path: /data/pv-example
---
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-example
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard
```
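The following sketch illustrates why the 5Gi claim above can bind to the 10Gi volume. It is a deliberately simplified model of the matching rules (capacity, storage class, and a single access mode only); the real controller considers more, and the function names are hypothetical.

```shell
# Convert a Kubernetes binary quantity (Ki, Mi, Gi) to bytes.
to_bytes() {
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

# Simplified binding check: the PV must be large enough, and the
# storage class and access mode must match.
can_bind() {
  # $1 = PV capacity, $2 = PVC request, $3 = PV class, $4 = PVC class,
  # $5 = PV access mode, $6 = PVC access mode
  [ "$(to_bytes "$1")" -ge "$(to_bytes "$2")" ] && [ "$3" = "$4" ] && [ "$5" = "$6" ]
}

if can_bind 10Gi 5Gi standard standard ReadWriteOnce ReadWriteOnce; then
  echo "pvc-example can bind to pv-example"
fi
```

Note that a bound claim receives the whole volume, so pvc-example ends up with 10Gi even though it asked for 5Gi.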
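Kubernetes networking
The Kubernetes network model is built on the following principles:
- Every Pod has its own IP address
- Pods can communicate with all other Pods directly, without NAT
- Nodes can communicate with all Pods
- Containers in the same Pod share a network namespace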
Network plugins
Kubernetes supports many network plugins, such as Calico, Flannel, and Weave Net. To install Calico, for example:
```shell
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
```
Network policies
A NetworkPolicy defines which traffic is allowed between Pods. The policy below lets only Pods labeled app: web reach Pods labeled app: api, and only on port 8080:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web
    ports:
    - port: 8080
```
Kubernetes storage
StorageClass
A StorageClass defines how storage is provisioned dynamically:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
```
Volume types
Kubernetes supports many volume types, such as emptyDir, hostPath, nfs, and csi. The example below mounts the pvc-example claim into a Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-example
spec:
  containers:
  - name: container-example
    image: nginx
    volumeMounts:
    - mountPath: /data
      name: data-volume
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: pvc-example
```
Kubernetes security
RBAC authorization
Role-based access control (RBAC) restricts access to the Kubernetes API. The Role below grants read access to Pods in the default namespace, and the RoleBinding assigns that Role to the user jane:
```yaml
# Role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```
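Pod security policies
A PodSecurityPolicy defines the security conditions a Pod must meet when it is created or updated. PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25 in favor of Pod Security Admission, so the manifest below applies only to older clusters:
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  fsGroup:
    rule: RunAsAny
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
```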
Kubernetes monitoring and logging
Monitoring with Prometheus
Prometheus is a widely used monitoring tool in the Kubernetes ecosystem:
```shell
# Install Prometheus with Helm
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
```
Visualization with Grafana
Grafana is often paired with Prometheus to visualize monitoring data:
```shell
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana
```
EFK logging stack
The EFK stack, made up of Elasticsearch, Fluentd, and Kibana, collects and analyzes logs:
```shell
# Install Elasticsearch and Kibana with Helm, then Fluentd as a DaemonSet
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch
helm install kibana elastic/kibana
kubectl apply -f https://raw.githubusercontent.com/fluent/fluentd-kubernetes-daemonset/master/fluentd-daemonset-elasticsearch.yaml
```
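Kubernetes application management
Helm package manager
Helm is the package manager for Kubernetes and simplifies application deployment:
```shell
# Install Helm 3
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

# Add a chart repository (the stable repository is archived and no longer updated)
helm repo add stable https://charts.helm.sh/stable

# Install a chart as a release named my-release
helm install my-release stable/mysql
```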
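Helm chart example
A chart contains a Chart.yaml file with metadata and a values.yaml file with default configuration values:
```yaml
# Chart.yaml
apiVersion: v2
name: my-app
description: A Helm chart for my application
version: 0.1.0
appVersion: 1.0.0

# values.yaml (the tag is quoted so that YAML does not read it as a number)
replicaCount: 2
image:
  repository: nginx
  tag: "1.19"
service:
  type: ClusterIP
  port: 80
```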
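Kubernetes best practices
Resource requests and limits
Setting resource requests and limits on a Pod helps the cluster allocate resources sensibly. The scheduler places Pods based on their requests, and limits cap what a container may use at runtime:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-example
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
```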
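As a rough illustration of how requests drive scheduling, the sketch below estimates how many copies of the Pod above fit on one node. The node size (2000m CPU, 4096Mi memory) is a hypothetical example, and the calculation ignores system reservations, other workloads, and the per-node Pod limit.

```shell
# Hypothetical allocatable capacity of one node.
node_cpu_m=2000
node_mem_mi=4096

# Requests from the resource-example Pod above.
req_cpu_m=250    # cpu: "250m"
req_mem_mi=64    # memory: "64Mi"

fit_by_cpu=$((node_cpu_m / req_cpu_m))
fit_by_mem=$((node_mem_mi / req_mem_mi))

# The scarcer resource decides how many Pods fit.
if [ "$fit_by_cpu" -lt "$fit_by_mem" ]; then
  fit=$fit_by_cpu
else
  fit=$fit_by_mem
fi
echo "pods that fit by requests: $fit"
```

Here CPU is the limiting resource: 8 Pods fit by CPU requests even though memory would allow 64.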
Pod disruption budgets
A PodDisruptionBudget limits how many Pods may be unavailable during voluntary disruptions such as node drains:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
```
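The effect of minAvailable can be worked out with simple arithmetic. This sketch assumes an absolute minAvailable value as in the manifest above (percentages are not handled), and the function name is illustrative.

```shell
# Number of voluntary disruptions a PodDisruptionBudget currently permits.
allowed_disruptions() {
  # $1 = healthy Pods matching the selector, $2 = minAvailable
  allowed=$(( $1 - $2 ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_disruptions 3 2   # three healthy Pods: one may be evicted
allowed_disruptions 2 2   # two healthy Pods: evictions are blocked
```

With only two healthy replicas of app: myapp, a node drain would wait until another replica becomes ready.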
Health checks
Liveness and readiness probes help keep an application healthy. A failing liveness probe restarts the container, while a failing readiness probe removes the Pod from Service endpoints until it recovers:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: health-check-example
spec:
  containers:
  - name: app
    image: myapp:1.0
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
```
Horizontal autoscaling
A HorizontalPodAutoscaler adjusts the number of Pods automatically, here based on average CPU utilization:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```
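The scaling decision follows the documented formula desired = ceil(currentReplicas * currentMetric / targetMetric), clamped to the configured minimum and maximum. The sketch below applies it to the settings above; it leaves out the controller's tolerance band and stabilization behavior, and the function name is made up for illustration.

```shell
# Desired replica count for a utilization target, clamped to [min, max].
desired_replicas() {
  # $1 = current replicas, $2 = current average CPU %, $3 = target %,
  # $4 = minReplicas, $5 = maxReplicas
  desired=$(( ($1 * $2 + $3 - 1) / $3 ))   # integer ceiling division
  if [ "$desired" -lt "$4" ]; then desired=$4; fi
  if [ "$desired" -gt "$5" ]; then desired=$5; fi
  echo "$desired"
}

desired_replicas 4 90 70 2 10    # above target: scale out
desired_replicas 4 20 70 2 10    # well below target: scale in
desired_replicas 8 140 70 2 10   # far above target: capped at maxReplicas
```

For example, 4 replicas at 90% average CPU against a 70% target give ceil(5.14) = 6 replicas.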
Advanced features
StatefulSet
A StatefulSet is designed for stateful applications and gives each Pod a stable network identity and persistent storage:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
```
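The "stable identity" of a StatefulSet comes from predictable naming. The sketch below lists the Pod names, per-Pod DNS names, and PVC names that the manifest above would produce, assuming the default namespace and the default cluster domain cluster.local; the function name is hypothetical.

```shell
# Print "<pod> <pod DNS name> <PVC name>" for each replica of a StatefulSet.
# Assumptions: namespace "default", cluster domain "cluster.local".
statefulset_identities() {
  # $1 = StatefulSet name, $2 = serviceName, $3 = replicas,
  # $4 = volumeClaimTemplate name
  i=0
  while [ "$i" -lt "$3" ]; do
    echo "$1-$i $1-$i.$2.default.svc.cluster.local $4-$1-$i"
    i=$((i + 1))
  done
}

statefulset_identities web nginx 3 www
```

The DNS names resolve only if a headless Service named nginx exists, which the manifest above refers to through serviceName but does not define.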
DaemonSet
A DaemonSet makes sure that all (or selected) nodes run one copy of a Pod, which suits tasks such as node monitoring and log collection:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:v1.10
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
```
Job and CronJob
A Job runs a task once to completion, and a CronJob runs Jobs on a schedule:
```yaml
# Job
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-job
spec:
  template:
    spec:
      containers:
      - name: batch-job
        image: batch-processor
      restartPolicy: Never
  backoffLimit: 4
---
# CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup-job
            image: backup-tool
          restartPolicy: OnFailure
```
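The schedule field uses the standard five-field cron syntax. As a reading aid, this sketch splits the schedule from the CronJob above into its named fields; the function name is made up for the example.

```shell
# Split a five-field cron schedule into named fields.
# A here-document is used so the "*" characters are never glob-expanded.
describe_schedule() {
  read -r minute hour day_of_month month day_of_week <<EOF
$1
EOF
  echo "minute=$minute hour=$hour day-of-month=$day_of_month month=$month day-of-week=$day_of_week"
}

describe_schedule "0 2 * * *"
```

Minute 0 and hour 2 with wildcards elsewhere means the backup runs every day at 02:00, interpreted in the time zone of the controller manager unless the CronJob sets its own time zone.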
Ingress
An Ingress manages the rules for external access to Services in the cluster:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret
```
Real-world deployment examples
Deploying a microservice application
Below is a typical microservice deployment with a frontend, a backend API, and a database. It assumes that a Secret named db-secret, a PersistentVolumeClaim named postgres-pvc, and a TLS Secret named myapp-tls already exist:
```yaml
# Database
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
        - name: POSTGRES_DB
          value: myapp
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: postgres-data
        persistentVolumeClaim:
          claimName: postgres-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
  - port: 5432
    targetPort: 5432
  clusterIP: None
---
# Backend API
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapp/api:v1.0
        env:
        - name: DB_HOST
          value: postgres
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
---
# Frontend
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: myapp/frontend:v1.0
        env:
        - name: API_URL
          value: http://api
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
---
# Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend
            port:
              number: 80
  tls:
  - hosts:
    - myapp.example.com
    secretName: myapp-tls
```
Blue-green deployment
Blue-green deployment is a strategy for reducing downtime. Two versions of the application run side by side, and the version label in the Service selector decides which one receives traffic. Changing the selector from version: blue to version: green switches all traffic to the new version, and changing it back rolls the release back:
```yaml
# Blue version (current)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:v1.0
        ports:
        - containerPort: 8080
---
# Green version (new)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: app
        image: myapp:v2.0
        ports:
        - containerPort: 8080
---
# Service (currently routing to blue)
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
```
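One way to perform the switch is to patch the selector of the myapp Service. The sketch below only builds the patch document so that it can be inspected without a cluster; the helper name selector_patch is hypothetical, and the kubectl command in the trailing comment shows how it might be applied.

```shell
# Build the JSON patch that points the myapp Service at one color.
selector_patch() {
  # $1 = target version label, for example "blue" or "green"
  printf '{"spec":{"selector":{"app":"myapp","version":"%s"}}}\n' "$1"
}

selector_patch green

# On a live cluster the switch (or a rollback to blue) could be applied with:
#   kubectl patch service myapp -p "$(selector_patch green)"
```

Keeping the blue Deployment running until the green version has been verified makes the rollback a single patch in the other direction.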
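Troubleshooting
Common problems
- Pod stuck in Pending: check whether the nodes have enough resources and whether its PVC is bound
- Pod in CrashLoopBackOff: check the container logs; the cause is often an application error
- Pod cannot be scheduled: check node affinity, taints, and tolerations
- Service unreachable: check the label selector, the port settings, and whether the Pods are running and ready
- Ingress not working: check that an Ingress controller is installed and that the configuration is correct
- PersistentVolumeClaim not binding: check the PV and StorageClass configuration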
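Debugging commands
```shell
# List Pods and their status
kubectl get pods

# Show the details and recent events of a Pod
kubectl describe pod <pod-name>

# View container logs
kubectl logs <pod-name>

# Open a shell inside a container
kubectl exec -it <pod-name> -- /bin/bash

# Show the details of a Service
kubectl describe service <service-name>

# Forward local port 8080 to port 80 of a Pod
kubectl port-forward <pod-name> 8080:80

# List cluster events
kubectl get events
```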
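Operational best practices
High-availability cluster configuration
- Highly available control plane: run multiple control plane nodes
- etcd cluster: run at least 3 etcd members, spread across availability zones
- Worker node redundancy: give every application enough replicas and spread them over multiple nodes
- Network redundancy: configure multiple network paths
- Storage redundancy: use a storage solution that replicates data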
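The recommendation of at least three etcd members follows from how quorum works: a cluster of n members needs floor(n/2) + 1 of them to agree, so it tolerates n minus that many failures. A small sketch of the arithmetic, with illustrative function names:

```shell
# Quorum size for an etcd cluster with $1 members.
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of member failures the cluster can survive.
etcd_fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

A single member tolerates no failures, three members tolerate one, and a fourth member adds no extra tolerance, which is why odd cluster sizes are preferred.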
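Cluster upgrades
- Control plane upgrade: follow the official upgrade documentation and upgrade the control plane components first
- Node upgrade: use a rolling strategy and upgrade a few nodes at a time
- Backup: back up etcd and critical configuration before upgrading
- Testing: rehearse the upgrade in a non-production environment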
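Disaster recovery
- Back up etcd regularly: etcdctl snapshot save
- Back up Kubernetes resources: use a tool such as Velero
- Write a recovery plan: include the steps for restoring data and rebuilding applications
- Run regular drills: test the recovery process periodically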
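Summary
Kubernetes has become the de facto standard for container orchestration. Its rich feature set and flexible configuration options make deploying and managing containerized applications simpler and more efficient. This article covered the architecture, core concepts, and practical use of Kubernetes, and will hopefully help readers pick up the technology quickly and put it to good use in real projects.
As cloud native technology keeps evolving, Kubernetes will continue to develop, offering a better user experience and more features. A solid understanding of its design principles and best practices will help you build more robust and scalable cloud native applications.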