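Introduction
Kubernetes (often abbreviated K8s) is an open-source container orchestration platform that automates the deployment, scaling, and management of applications. It was originally designed at Google and later donated to the Cloud Native Computing Foundation (CNCF), and it now sits at the core of the cloud native ecosystem. Kubernetes schedules containers across a cluster of hosts, which simplifies scaling, upgrading, and recovering applications. This article covers the Kubernetes architecture, its core concepts, and practical usage, to help readers get productive with it quickly.
Reference: the official Kubernetes documentation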
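Why choose Kubernetes?
Among container orchestrators, Kubernetes offers the following clear advantages:
- Scalability: scale an application from a single instance to thousands
- Automation: automated rollouts, autoscaling, and repair
- Service discovery and load balancing: built-in DNS and load balancing for Services
- Storage orchestration: automatically mount storage systems of many types
- Automatic bin packing: place containers automatically based on resource requirements and constraints
- Self-healing: restart failed containers and reschedule Pods away from unhealthy nodes
- Configuration management: manage Secrets and configuration without rebuilding images
- Batch execution: manage batch and CI workloads
- Rich ecosystem: plugins and tools covering monitoring, logging, CI/CD, and more
- Community support: an active open-source community that provides ongoing support and updates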
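Kubernetes architecture
A Kubernetes cluster consists of a control plane and a set of worker nodes.
Control plane components
- kube-apiserver: the API server, the entry point for every operation on the cluster
- etcd: a distributed key-value store that holds all cluster state
- kube-scheduler: assigns Pods to nodes based on resource requirements and policies
- kube-controller-manager: runs controller processes such as the node controller and the replication controller
- cloud-controller-manager: runs the controllers that interact with the underlying cloud provider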
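To make the scheduler's role more concrete, here is a minimal Python sketch of the filter-then-score idea: discard nodes that cannot fit the Pod's requests, then pick the best of the remaining ones. The `Node` type, the node names, and the "most free CPU wins" scoring rule are assumptions made for illustration; the real kube-scheduler applies a much richer set of filters and scores.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_m: int   # unreserved CPU in millicores
    free_mem_mi: int  # unreserved memory in MiB


def pick_node(nodes: List[Node], cpu_m: int, mem_mi: int) -> Optional[str]:
    """Return the name of a node that fits the request, or None."""
    # Filter: keep only nodes with enough unreserved CPU and memory.
    feasible = [n for n in nodes if n.free_cpu_m >= cpu_m and n.free_mem_mi >= mem_mi]
    if not feasible:
        return None  # no node fits, so the Pod would stay Pending
    # Score: prefer the node with the most free CPU (illustrative rule only).
    return max(feasible, key=lambda n: n.free_cpu_m).name


nodes = [Node("node-a", 500, 1024), Node("node-b", 2000, 4096), Node("node-c", 100, 8192)]
print(pick_node(nodes, cpu_m=250, mem_mi=2048))  # node-b
print(pick_node(nodes, cpu_m=4000, mem_mi=512))  # None
```

The second call returns `None`, which corresponds to a Pod that stays in the Pending state until a node with enough capacity appears.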
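Worker node components
- kubelet: the node agent that makes sure the containers described in each Pod are running and healthy
- kube-proxy: maintains network rules on the node to implement the Service abstraction
- Container runtime: runs the containers, for example containerd or CRI-O (Docker Engine needs the cri-dockerd adapter since Kubernetes 1.24)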
Installing Kubernetes
Kubernetes can be installed in several ways. The most common ones are described below.
Installing with kubeadm
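```bash
# Install prerequisites
sudo apt-get update && sudo apt-get install -y apt-transport-https ca-certificates curl gpg

# Add the Kubernetes apt repository. The legacy apt.kubernetes.io repository
# has been retired in favor of pkgs.k8s.io. v1.30 is only an example minor
# version; substitute the one you want to install.
sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update

# Install kubelet, kubeadm, and kubectl, and pin their versions
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Initialize the control plane (10.244.0.0/16 is Flannel's default Pod CIDR)
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# Configure kubectl for the current user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install the Flannel network add-on
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
```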
Running locally with Minikube
```bash
# Download and install the Minikube binary
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start a local cluster
minikube start
```
Using a cloud provider
Most cloud providers offer a managed Kubernetes service, such as Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS).
Kubernetes core concepts
Pod
A Pod is the smallest deployable unit in Kubernetes and contains one or more containers:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.19
    ports:
    - containerPort: 80
```
ReplicaSet
A ReplicaSet ensures that a specified number of Pod replicas is running at any time:
```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
```
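The controller behind a ReplicaSet works as a reconciliation loop: it compares the desired replica count with what is actually running and acts on the difference. The sketch below illustrates one pass of that idea in Python. The function name, the action tuples, and the placeholder Pod names are hypothetical and do not mirror the real controller code.

```python
from typing import List, Tuple


def reconcile(desired: int, running: List[str]) -> List[Tuple[str, str]]:
    """Return the actions needed to move from `running` Pods to `desired` replicas."""
    diff = desired - len(running)
    if diff > 0:
        # Too few Pods: create the missing ones (names are placeholders).
        return [("create", f"new-pod-{i}") for i in range(diff)]
    if diff < 0:
        # Too many Pods: delete the surplus, here simply the last ones listed.
        return [("delete", name) for name in running[diff:]]
    return []  # observed state already matches the desired state


print(reconcile(3, ["nginx-a"]))                        # two create actions
print(reconcile(3, ["nginx-a", "nginx-b", "nginx-c"]))  # []
print(reconcile(1, ["nginx-a", "nginx-b", "nginx-c"]))  # two delete actions
```

Running the same function again after the actions are applied yields an empty list, which is the point of the pattern: the loop is safe to repeat.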
Deployment
A Deployment manages ReplicaSets and adds declarative updates and rollbacks:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
```
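The `maxSurge` and `maxUnavailable` settings bound how many Pods may exist and how many must stay available while a rolling update is in progress. The following Python sketch works out those bounds; the helper names are made up for this example. It relies on the documented rounding rule that a percentage is rounded up for `maxSurge` and down for `maxUnavailable`.

```python
from typing import Tuple, Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a percentage string such as "25%" into a Pod count."""
    if isinstance(value, int):
        return value
    scaled = int(value.rstrip("%")) * replicas  # integer math avoids float rounding errors
    return -(-scaled // 100) if round_up else scaled // 100


def rollout_bounds(replicas: int, max_surge: Union[int, str],
                   max_unavailable: Union[int, str]) -> Tuple[int, int]:
    """Return (maximum total Pods, minimum available Pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


print(rollout_bounds(3, 1, 1))           # (4, 2)
print(rollout_bounds(10, "25%", "25%"))  # (13, 8)
```

For the Deployment above (3 replicas, both settings at 1), at most 4 Pods exist at once and at least 2 remain available throughout the update.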
Service
A Service abstracts access to a set of Pods and provides a stable network endpoint for them:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
```
ConfigMap and Secret
A ConfigMap stores non-sensitive configuration, while a Secret stores sensitive data:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    environment=production
    log_level=info
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  db-password: cGFzc3dvcmQ=  # base64-encoded value
```
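Values under `data` in a Secret are base64-encoded, which is an encoding and not encryption. As a quick illustration, this Python snippet reproduces the `db-password` value used above and decodes it again:

```python
import base64

# Encode a plaintext value the way it must appear under `data` in a Secret.
encoded = base64.b64encode(b"password").decode("ascii")
print(encoded)  # cGFzc3dvcmQ=

# Anyone who can read the Secret can reverse the encoding just as easily.
decoded = base64.b64decode("cGFzc3dvcmQ=").decode("utf-8")
print(decoded)  # password
```

Because decoding is trivial, access to Secrets should be restricted, for example with the RBAC rules described in the security section.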
Namespace
A Namespace divides the resources of one cluster into multiple virtual clusters:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: development
```
PersistentVolume and PersistentVolumeClaim
A PersistentVolume (PV) provides a piece of storage, and a PersistentVolumeClaim (PVC) is a request for such storage:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  hostPath:
    path: /data/pv-example
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-example
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard
```
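A claim binds to a volume only when the storage class matches, the volume offers every requested access mode, and the volume is at least as large as the request. The Python sketch below checks those three conditions for the manifests above. The dictionary layout and the `can_bind` helper are simplifications invented for this example, with sizes given as plain GiB integers; the real binder also considers fields such as volume mode and label selectors.

```python
def can_bind(pv: dict, pvc: dict) -> bool:
    """Check whether a PersistentVolume can satisfy a PersistentVolumeClaim."""
    return (
        pv["storage_class"] == pvc["storage_class"]
        and set(pvc["access_modes"]) <= set(pv["access_modes"])  # PV offers all requested modes
        and pv["capacity_gi"] >= pvc["request_gi"]               # PV is large enough
    )


pv = {"storage_class": "standard", "access_modes": ["ReadWriteOnce"], "capacity_gi": 10}
pvc = {"storage_class": "standard", "access_modes": ["ReadWriteOnce"], "request_gi": 5}

print(can_bind(pv, pvc))                         # True
print(can_bind(pv, {**pvc, "request_gi": 20}))   # False: the volume is too small
```

Note that binding is one-to-one: the 5Gi claim takes the whole 10Gi volume, and the remaining capacity cannot be claimed by anyone else.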
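Kubernetes networking
The Kubernetes network model is based on the following principles:
- Every Pod gets its own IP address
- Pods can communicate with all other Pods directly, without NAT
- Agents on a node (such as the kubelet) can communicate with all Pods on that node
- Containers in the same Pod share a network namespace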
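Network plugins
Kubernetes supports many network plugins, such as Calico, Flannel, and Weave Net. The command below installs Calico; the manifest location changes between Calico releases, so check the Calico documentation for the current URL:
```bash
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
```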
Network policies
A NetworkPolicy defines the rules for traffic between Pods. The policy below allows Pods labeled app=web to reach Pods labeled app=api on port 8080:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-allow
spec:
  podSelector:
    matchLabels:
      app: api
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web
    ports:
    - port: 8080
```
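To show how the `api-allow` policy is evaluated, here is a small Python sketch of the decision for a single incoming connection. It assumes that `api-allow` is the only policy in the namespace, that source and destination Pods live in that same namespace, and that the cluster's network plugin enforces NetworkPolicy. The function names are hypothetical.

```python
from typing import Dict


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """A matchLabels selector matches when every key/value pair is present."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(src_labels: Dict[str, str], dst_labels: Dict[str, str], port: int) -> bool:
    """Evaluate the api-allow policy for one connection attempt."""
    if not matches({"app": "api"}, dst_labels):
        return True  # the policy does not select this Pod, so traffic is unrestricted
    # The destination is selected: only app=web sources on port 8080 are allowed.
    return matches({"app": "web"}, src_labels) and port == 8080


print(ingress_allowed({"app": "web"}, {"app": "api"}, 8080))  # True
print(ingress_allowed({"app": "web"}, {"app": "api"}, 9090))  # False
print(ingress_allowed({"app": "db"}, {"app": "api"}, 8080))   # False
print(ingress_allowed({"app": "db"}, {"app": "web"}, 80))     # True
```

The last case is worth noting: Pods that no policy selects stay open to all traffic, so isolating a namespace usually requires an additional default-deny policy.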
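Kubernetes storage
StorageClass
A StorageClass defines how storage is provisioned dynamically:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
# In-tree AWS EBS provisioner; newer clusters use the EBS CSI driver (ebs.csi.aws.com)
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
```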
Volume types
Kubernetes supports many volume types, such as emptyDir, hostPath, nfs, and csi. The Pod below mounts the PersistentVolumeClaim pvc-example at /data:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-example
spec:
  containers:
  - name: container-example
    image: nginx
    volumeMounts:
    - mountPath: /data
      name: data-volume
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: pvc-example
```
Kubernetes security
RBAC authorization
Role-based access control (RBAC) restricts access to the Kubernetes API. The example grants the user jane read-only access to Pods in the default namespace:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```
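Pod security policies
A PodSecurityPolicy defines the security conditions a Pod must meet to be created or updated. The API was deprecated in Kubernetes 1.21 and removed in 1.25, so the manifest below only applies to older clusters; newer clusters use Pod Security Admission instead:
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
spec:
  privileged: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  supplementalGroups:  # required field, missing from the original listing
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'persistentVolumeClaim'
```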
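Kubernetes monitoring and logging
Monitoring with Prometheus
Prometheus is a widely used monitoring tool in the Kubernetes ecosystem:
```bash
# Add the Prometheus community chart repository and install Prometheus
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
```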
Visualization with Grafana
Grafana is often paired with Prometheus to visualize monitoring data:
```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana
```
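EFK logging stack
The EFK stack (Elasticsearch, Fluentd, and Kibana) collects and analyzes logs:
```bash
# Install Elasticsearch and Kibana from the Elastic chart repository
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch
helm install kibana elastic/kibana
# Deploy Fluentd as a DaemonSet that ships logs to Elasticsearch
kubectl apply -f https://raw.githubusercontent.com/fluent/fluentd-kubernetes-daemonset/master/fluentd-daemonset-elasticsearch.yaml
```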
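Kubernetes application management
Helm package manager
Helm is the package manager for Kubernetes and simplifies application deployment:
```bash
# Install Helm 3
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh

# Add a chart repository (the stable repository is archived and no longer updated)
helm repo add stable https://charts.helm.sh/stable

# Install a chart
helm install my-release stable/mysql
```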
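Helm chart example
A chart consists of at least a Chart.yaml file with metadata and a values.yaml file with default settings. Version-like values are quoted so that YAML reads them as strings rather than numbers:
```yaml
# Chart.yaml
apiVersion: v2
name: my-app
description: A Helm chart for my application
version: 0.1.0
appVersion: "1.0.0"

# values.yaml (a separate file in the chart directory)
replicaCount: 2
image:
  repository: nginx
  tag: "1.19"
service:
  type: ClusterIP
  port: 80
```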
Kubernetes best practices
Resource requests and limits
Setting resource requests and limits on each container lets the scheduler place Pods sensibly and keeps a single container from exhausting a node:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-example
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
```
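The quantity strings in this manifest use Kubernetes notation: `250m` means 250 millicores (a quarter of one CPU) and `64Mi` means 64 × 2^20 bytes. The Python sketch below converts a few common forms and checks that requests do not exceed limits, which the API server requires. The helper names are hypothetical, and only a subset of the suffixes Kubernetes accepts is handled.

```python
BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}
DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as 250m, 0.5, or 2 to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as 64Mi, 128M, or 1024 to bytes."""
    if quantity[-2:] in BINARY:
        return int(quantity[:-2]) * BINARY[quantity[-2:]]
    if quantity[-1:] in DECIMAL:
        return int(quantity[:-1]) * DECIMAL[quantity[-1:]]
    return int(quantity)  # a bare number is already in bytes


def within_limits(requests: dict, limits: dict) -> bool:
    """Requests must not exceed limits for either resource."""
    return (cpu_to_millicores(requests["cpu"]) <= cpu_to_millicores(limits["cpu"])
            and memory_to_bytes(requests["memory"]) <= memory_to_bytes(limits["memory"]))


requests = {"cpu": "250m", "memory": "64Mi"}
limits = {"cpu": "500m", "memory": "128Mi"}
print(cpu_to_millicores("250m"), memory_to_bytes("64Mi"))  # 250 67108864
print(within_limits(requests, limits))                     # True
```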
Pod disruption budgets
A PodDisruptionBudget limits how many Pods may be unavailable at once during voluntary disruptions such as node drains:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
```
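The effect of `minAvailable` can be expressed as simple arithmetic: the number of evictions permitted at a given moment is the count of healthy Pods minus the required minimum, never below zero. A minimal Python sketch, with a made-up function name:

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of Pods that may be evicted voluntarily right now."""
    return max(0, healthy_pods - min_available)


print(allowed_disruptions(3, 2))  # 1
print(allowed_disruptions(2, 2))  # 0
print(allowed_disruptions(1, 2))  # 0
```

With `minAvailable: 2` and three healthy replicas, a node drain can evict one Pod at a time and must wait for its replacement to become healthy before evicting the next.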
Health checks
Liveness and readiness probes let Kubernetes restart containers that have hung and keep traffic away from containers that are not ready yet:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: health-check-example
spec:
  containers:
  - name: app
    image: myapp:1.0
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
```
Horizontal autoscaling
A HorizontalPodAutoscaler adjusts the number of Pods automatically, here based on average CPU utilization:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```
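The autoscaler derives its target from the ratio between the observed and the desired metric value: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The Python sketch below applies that formula to the settings of `app-hpa`. It is a simplification that ignores stabilization windows and unready Pods; the function name is hypothetical, and the 0.1 tolerance is assumed to match the controller's default.

```python
import math


def desired_replicas(current: int, current_util: float, target_util: float,
                     min_replicas: int, max_replicas: int, tolerance: float = 0.1) -> int:
    """Apply the HPA scaling formula and clamp the result to the allowed range."""
    ratio = current_util / target_util
    if abs(ratio - 1.0) <= tolerance:
        wanted = current  # close enough to the target: keep the current size
    else:
        wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# Settings from app-hpa: target 70% CPU, between 2 and 10 replicas.
print(desired_replicas(4, 90, 70, 2, 10))   # 6
print(desired_replicas(4, 72, 70, 2, 10))   # 4
print(desired_replicas(4, 20, 70, 2, 10))   # 2
print(desired_replicas(8, 140, 70, 2, 10))  # 10
```

Utilization is measured relative to each container's CPU request, so the target Deployment needs CPU requests for this metric to work.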
Advanced features
StatefulSet
A StatefulSet is designed for stateful applications and gives each Pod a stable network identity and its own persistent storage. The serviceName field must reference a headless Service (here nginx) that is created separately:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
```
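The "stable network identity" follows a fixed naming scheme: Pods are numbered from 0, and each one gets a DNS name of the form pod-name.service-name.namespace.svc.cluster-domain. The Python sketch below generates the names for the `web` StatefulSet, assuming the `default` namespace and the common `cluster.local` cluster domain (both are assumptions, not values from the manifest).

```python
from typing import List


def stable_identities(name: str, service: str, replicas: int,
                      namespace: str = "default",
                      cluster_domain: str = "cluster.local") -> List[str]:
    """Return the per-Pod DNS names a StatefulSet gets through its headless Service."""
    return [f"{name}-{i}.{service}.{namespace}.svc.{cluster_domain}" for i in range(replicas)]


for host in stable_identities("web", "nginx", 3):
    print(host)
# web-0.nginx.default.svc.cluster.local
# web-1.nginx.default.svc.cluster.local
# web-2.nginx.default.svc.cluster.local
```

A Pod that is rescheduled keeps its ordinal, its DNS name, and its PersistentVolumeClaim (www-web-0, www-web-1, and so on), which is what distinguishes a StatefulSet from a Deployment.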
DaemonSet
A DaemonSet ensures that all (or a selected subset of) nodes run one copy of a Pod, which suits tasks such as node monitoring and log collection:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:v1.10
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
```
Job and CronJob
A Job runs a task once to completion, and a CronJob runs Jobs on a schedule. The schedule "0 2 * * *" below starts the backup every day at 02:00:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-job
spec:
  template:
    spec:
      containers:
      - name: batch-job
        image: batch-processor
      restartPolicy: Never
  backoffLimit: 4
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup-job
            image: backup-tool
          restartPolicy: OnFailure
```
Ingress
An Ingress manages the rules for external access to Services in the cluster:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    # ingress-nginx only: rewrites every matched request path to "/".
    # Remove it if the backends need the original path.
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
  tls:
  - hosts:
    - app.example.com
    secretName: app-tls-secret
```
Real-world deployment examples
Deploying a microservice application
Below is a typical microservice deployment with a frontend, a backend API, and a database. The manifests reference a Secret named db-secret, a PersistentVolumeClaim named postgres-pvc, and a TLS Secret named myapp-tls, all of which must be created separately:
```yaml
# Database
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
        - name: POSTGRES_DB
          value: myapp
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: postgres-data
        persistentVolumeClaim:
          claimName: postgres-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  selector:
    app: postgres
  ports:
  - port: 5432
    targetPort: 5432
  clusterIP: None
---
# Backend API
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapp/api:v1.0
        env:
        - name: DB_HOST
          value: postgres
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
---
# Frontend
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: myapp/frontend:v1.0
        env:
        - name: API_URL
          value: http://api
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
---
# Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
  annotations:
    # ingress-nginx only: rewrites every matched request path to "/"
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend
            port:
              number: 80
  tls:
  - hosts:
    - myapp.example.com
    secretName: myapp-tls
```
Blue-green deployment
Blue-green deployment is a strategy that reduces downtime. Two versions of the application run side by side, and the Service selector decides which one receives traffic; changing version: blue to version: green in the Service switches traffic to the new version:
```yaml
# Blue (current version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
      - name: app
        image: myapp:v1.0
        ports:
        - containerPort: 8080
---
# Green (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: app
        image: myapp:v2.0
        ports:
        - containerPort: 8080
---
# Service (currently routes to blue)
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
```
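The cutover works because a Service routes to every Pod whose labels contain all key/value pairs of its selector. The Python sketch below models that matching and shows how changing one selector value moves traffic from blue to green. The Pod names are invented for illustration, and the `endpoints` helper is a simplification that ignores Pod readiness.

```python
from typing import Dict, List


def endpoints(selector: Dict[str, str], pods: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of Pods whose labels contain every selector pair."""
    return sorted(
        name for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


pods = {
    "app-blue-1": {"app": "myapp", "version": "blue"},
    "app-blue-2": {"app": "myapp", "version": "blue"},
    "app-green-1": {"app": "myapp", "version": "green"},
    "app-green-2": {"app": "myapp", "version": "green"},
}

selector = {"app": "myapp", "version": "blue"}
print(endpoints(selector, pods))  # ['app-blue-1', 'app-blue-2']

selector["version"] = "green"     # the cutover: one field changes
print(endpoints(selector, pods))  # ['app-green-1', 'app-green-2']
```

Rolling back is the same operation in reverse, which is why the blue Deployment is usually kept running until the green version has proven stable.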
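Troubleshooting
Common problems
- Pod stuck in Pending: check whether the nodes have enough resources and whether its PVC is bound
- Pod in CrashLoopBackOff: check the container logs; the application is usually failing at startup
- Pod cannot be scheduled: check node affinity, taints, and tolerations
- Service unreachable: check the label selector, the port settings, and whether the Pods are running and ready
- Ingress not working: check that an Ingress controller is installed and that the rules are correct
- PersistentVolumeClaim will not bind: check the PV and StorageClass configuration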
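Debugging commands
```bash
# List Pods and their status
kubectl get pods

# Show detailed information and recent events for a Pod
kubectl describe pod <pod-name>

# View container logs
kubectl logs <pod-name>

# Open a shell inside the container (use /bin/sh if the image has no bash)
kubectl exec -it <pod-name> -- /bin/bash

# Inspect a Service, including its endpoints
kubectl describe service <service-name>

# Forward local port 8080 to port 80 of the Pod
kubectl port-forward <pod-name> 8080:80

# List cluster events
kubectl get events
```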
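Operations best practices
High-availability cluster configuration
- Control plane high availability: run multiple control plane nodes
- etcd cluster: run at least 3 etcd members, spread across availability zones
- Worker node redundancy: give every application enough replicas, distributed over multiple nodes
- Network redundancy: configure multiple network paths
- Storage redundancy: use a storage solution that replicates data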
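The recommendation of at least 3 etcd members follows from how quorum works: a write needs a majority of members, so a cluster of n members tolerates n minus majority failures. A short Python sketch of that arithmetic (function names are made up for this example):

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while the cluster still has quorum."""
    return members - quorum(members)


for n in (1, 3, 4, 5):
    print(n, quorum(n), fault_tolerance(n))
# 1 1 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Four members tolerate no more failures than three, which is why etcd clusters are normally sized with an odd number of members.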
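Cluster upgrades
- Control plane upgrade: follow the official upgrade documentation and upgrade the control plane components first
- Node upgrade: use a rolling strategy and upgrade a few nodes at a time
- Backup: back up etcd and critical configuration before upgrading
- Testing: rehearse the upgrade in a non-production environment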
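Disaster recovery
- Back up etcd regularly: etcdctl snapshot save <snapshot-file>
- Back up Kubernetes resources with a tool such as Velero
- Write a recovery plan that covers data restoration and application rebuild steps
- Rehearse regularly: test the recovery procedure on a schedule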
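Summary
Kubernetes has become the de facto standard for container orchestration. Its rich feature set and flexible configuration make deploying and managing containerized applications straightforward and efficient. This article covered the Kubernetes architecture, its core concepts, and practical usage, and should help readers apply Kubernetes effectively in real projects.
As cloud native technology evolves, Kubernetes will keep evolving with it and gain further features and usability improvements. A solid understanding of its design principles and best practices makes it possible to build more robust and scalable cloud native applications.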