Kubernetes Storage

Date: Dec. 13, 2018 Category: Containers

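Kubernetes Volume

When a container is destroyed, everything stored in its internal filesystem is lost. A Kubernetes Volume is what preserves the container's data.

A Volume's lifecycle is independent of the containers: the containers in a Pod may be destroyed and recreated, but the Volume is kept. Volume provides an abstraction over many kinds of backend. Kubernetes Volume supports backend types including emptyDir, hostPath, GCE Persistent Disk, AWS Elastic Block Store, NFS, and Ceph; see the official documentation for the full list.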

emptyDir Volume

An emptyDir Volume is an empty directory on the Host.

apiVersion: v1
kind: Pod
metadata:
  name: producer-consumer
spec:
  containers:
  - image: busybox
    name: producer
    volumeMounts:
    - mountPath: /producer_dir
      name: shared-volume
    args:
    - /bin/sh
    - -c
    - echo "Hello World" > /producer_dir/hello; sleep 30000
  - image: busybox
    name: consumer
    volumeMounts:
    - mountPath: /consumer_dir
      name: shared-volume
    args:
    - /bin/sh
    - -c
    - cat /consumer_dir/hello; sleep 30000
  volumes:
  - name: shared-volume
    emptyDir: {}

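volumes defines an emptyDir Volume named shared-volume. The producer container mounts it at /producer_dir and the consumer container mounts it at /consumer_dir.

After applying, check which node the Pod was scheduled to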

[why@why-01 ~]$ kubectl get pod -o wide
NAME                READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
producer-consumer   2/2     Running   0          5m38s   10.244.2.71   why-03   <none>           <none>

On that node, inspect the mounts of each container in the Pod

[root@why-03 ~]# docker ps -a
CONTAINER ID        IMAGE                    COMMAND                   CREATED             STATUS                  PORTS               NAMES
b354b88554ab        busybox                  "/bin/sh -c 'cat /co…"    6 minutes ago       Up 6 minutes                                k8s_consumer_producer-consumer_default_6a8a819d-fbc5-11e8-91a0-5254005c0df5_0
cc48349fe116        busybox                  "/bin/sh -c 'echo \"H…"   6 minutes ago       Up 6 minutes                                k8s_producer_producer-consumer_default_6a8a819d-fbc5-11e8-91a0-5254005c0df5_0
[root@why-03 ~]# docker inspect b354b88554ab

            {
                "Type": "bind",
                "Source": "/var/lib/kubelet/pods/6a8a819d-fbc5-11e8-91a0-5254005c0df5/volumes/kubernetes.io~empty-dir/shared-volume",
                "Destination": "/consumer_dir",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }

[root@why-03 ~]# docker inspect cc48349fe116
            {
                "Type": "bind",
                "Source": "/var/lib/kubelet/pods/6a8a819d-fbc5-11e8-91a0-5254005c0df5/volumes/kubernetes.io~empty-dir/shared-volume",
                "Destination": "/producer_dir",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },

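Both Docker containers mount the same host directory.

An emptyDir is a temporary directory created on the Host. Its advantage is that it gives the containers in a Pod shared storage with no extra configuration. It is not persistent, though: once the Pod is gone, the emptyDir is gone too. That makes emptyDir a good fit when the containers in a Pod need temporary shared space, as in the producer-consumer case.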
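
The sharing mechanism can be tried without a cluster. The following is a minimal local sketch, assuming a temporary directory can stand in for the emptyDir that kubelet creates on the node; the variable names are illustrative and nothing here calls Kubernetes.

```shell
# Local simulation only: a temporary directory plays the role of the
# emptyDir that kubelet creates under /var/lib/kubelet/pods/... on the node.
shared_volume=$(mktemp -d)

# Both "containers" see the same host directory under different mount paths.
producer_dir=$shared_volume   # mounted at /producer_dir in the producer
consumer_dir=$shared_volume   # mounted at /consumer_dir in the consumer

# The producer writes a file and the consumer reads the very same file.
echo "Hello World" > "$producer_dir/hello"
consumer_output=$(cat "$consumer_dir/hello")
echo "consumer read: $consumer_output"

# Deleting the Pod removes the emptyDir together with its contents.
rm -rf "$shared_volume"
```

The two mount paths are just two names for one directory, which is why the `docker inspect` output above shows an identical Source for both containers.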

hostPath Volume

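A hostPath Volume mounts a directory from the Host's filesystem into the Pod. It suits workloads that need access to files on the node itself.

kube-apiserver, for example, gets its configuration files mounted into its Pod this way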

[why@why-01 ~]$ kubectl edit --namespace=kube-system pod kube-apiserver-why-01
...(omitted)
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: ca-certs
      readOnly: true
    - mountPath: /etc/pki
      name: etc-pki
      readOnly: true
    - mountPath: /etc/kubernetes/pki
      name: k8s-certs
      readOnly: true
...(omitted)
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/pki
      type: DirectoryOrCreate
    name: etc-pki
  - hostPath:
      path: /etc/kubernetes/pki
      type: DirectoryOrCreate
    name: k8s-certs
...(omitted)

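If the Pod is destroyed, the directory behind the hostPath is kept, so hostPath is more durable than emptyDir. Once the Host itself crashes, however, the hostPath can no longer be accessed.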
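
For an ordinary workload, a hostPath mount could look like the sketch below. The Pod name, the file name hostpath-pod.yml, and the choice of /var/log are assumptions made for illustration; they are not taken from the cluster shown above. The script only writes the manifest, so it runs without a cluster.

```shell
# Write a hypothetical manifest; hostpath-demo, hostpath-pod.yml and /var/log
# are illustrative assumptions.
cat > hostpath-pod.yml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
  - name: reader
    image: busybox
    args:
    - /bin/sh
    - -c
    - ls /host-logs; sleep 30000
    volumeMounts:
    - mountPath: /host-logs
      name: host-logs
      readOnly: true
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log
      type: Directory   # the directory must already exist on the node
EOF
echo "wrote hostpath-pod.yml"
# Against a real cluster, the next step would be: kubectl apply -f hostpath-pod.yml
```

Unlike the DirectoryOrCreate type used by kube-apiserver above, type Directory makes the Pod fail to start if the path is missing, which is usually the safer choice when the data is expected to exist already.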

PV

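A PersistentVolume (PV) is a piece of storage in an external storage system. Like a Volume, a PV is persistent, and its lifecycle is independent of any Pod.

A PersistentVolumeClaim (PVC) is a claim on a PV. PVCs are usually created and maintained by ordinary users. When a Pod needs storage, the user creates a PVC stating the required capacity and access mode (for example read-only), and Kubernetes finds and binds a PV that satisfies it.

With PersistentVolumeClaim, developers only tell Kubernetes what kind of storage they need, without caring where the space actually comes from or how it is accessed. Those Storage Provider details are left to operations; only operators need to care about how a PersistentVolume is created.

For the full list of PV types supported by Kubernetes, refer to the official documentation.

The example below uses NFS

Server side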

$ yum install -y nfs-utils
$ systemctl start rpcbind
$ systemctl start nfs
$ mkdir -p /data/nfs
$ chown nfsnobody.nfsnobody /data/nfs/
$ vi /etc/exports
/data/nfs 172.19.0.0/16(rw,sync)
$ systemctl reload nfs
$ showmount -e localhost
Export list for localhost:
/data/nfs 172.19.0.0/16

Client side

$ yum install -y nfs-utils
$ systemctl start rpcbind
$ showmount -e 172.19.0.4
Export list for 172.19.0.4:
/data/nfs 172.19.0.0/16

Static provisioning

nfs-pv1.yml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mypv1
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: nfs
  nfs:
    path: /data/nfs
    server: 172.19.0.4

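capacity sets the PV's capacity to 1Gi.
accessModes sets the access mode to ReadWriteOnce. The supported access modes are:

ReadWriteOnce: the PV can be mounted read-write by a single node.
ReadOnlyMany: the PV can be mounted read-only by many nodes.
ReadWriteMany: the PV can be mounted read-write by many nodes.

persistentVolumeReclaimPolicy sets the PV's reclaim policy to Recycle. The supported policies are:

Retain: the volume must be reclaimed manually.
Recycle: the data in the PV is scrubbed, roughly equivalent to running rm -rf /thevolume/* (this policy is deprecated in current Kubernetes releases).
Delete: the corresponding storage asset on the Storage Provider is deleted, for example an AWS EBS, GCE PD, Azure Disk, or OpenStack Cinder volume.

storageClassName sets the PV's class to nfs. This effectively puts the PV into a category; a PVC can name a class to claim a PV of that class.
path is the directory on the NFS server that backs the PV.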

[why@why-01 ~]$ kubectl apply -f nfs-pv1.yml 
persistentvolume/mypv1 created
[why@why-01 ~]$ kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
mypv1   1Gi        RWO            Recycle          Available           nfs                     4m54s
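
The ACCESS MODES column abbreviates the three access modes described above. As a small reading aid, here is a sketch of a helper that expands the abbreviations; the function name access_mode_name is made up for this post and is not part of kubectl.

```shell
# Expand the abbreviations shown in the ACCESS MODES column of `kubectl get pv`.
# access_mode_name is a hypothetical helper, not a kubectl feature.
access_mode_name() {
  case "$1" in
    RWO) echo "ReadWriteOnce" ;;
    ROX) echo "ReadOnlyMany" ;;
    RWX) echo "ReadWriteMany" ;;
    *)   echo "unknown access mode: $1" >&2; return 1 ;;
  esac
}

access_mode_name RWO
```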

STATUS is Available, which means mypv1 is ready to be claimed by a PVC.

nfs-pvc1.yml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs

Apply the PVC

[why@why-01 ~]$ kubectl apply -f nfs-pvc1.yml 
persistentvolumeclaim/mypvc1 created
[why@why-01 ~]$ kubectl get pvc
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mypvc1   Bound    mypv1    1Gi        RWO            nfs            6s
[why@why-01 ~]$ kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS   REASON   AGE
mypv1   1Gi        RWO            Recycle          Bound    default/mypvc1   nfs                     11m

mypvc1 is now Bound to mypv1

pod-pvc1.yml

apiVersion: v1
kind: Pod
metadata:
  name: mypod1
spec:
  containers:
    - name: mypod1
      image: busybox
      args:
      - /bin/sh
      - -c
      - sleep 30000
      volumeMounts:
      - mountPath: "/mydata"
        name: mydata
  volumes:
    - name: mydata
      persistentVolumeClaim:
        claimName: mypvc1

In volumes, persistentVolumeClaim tells the Pod to use the Volume claimed by mypvc1.

Create the Pod

[why@why-01 ~]$ kubectl apply -f pod-pvc1.yml
pod/mypod1 created
[why@why-01 ~]$ kubectl exec mypod1 -- touch /mydata/a

Check on the NFS host that the file was created

[root@why-03 nfs]# pwd
/data/nfs
[root@why-03 nfs]# ll
total 0
-rw-r--r-- 1 nfsnobody nfsnobody 0 Dec 10 02:01 a

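Reclaiming a PV

Because the PV was created with persistentVolumeReclaimPolicy: Recycle, once the PVC mypvc1 is deleted Kubernetes starts a new Pod, recycler-for-mypv1, to delete the data in PV mypv1. During that time mypv1 is in the Released state: it is no longer bound to mypvc1 and its data is being scrubbed, but it cannot be claimed yet. When the scrub finishes, mypv1 goes back to Available and can be claimed by a new PVC.
If the policy is Retain instead, the data in the PV is still there after the PVC is deleted, and the PV stays Released until it is reclaimed manually.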
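
The difference between the two policies can be imitated locally. This is only a sketch: two temporary directories stand in for the NFS export behind a PV, and the rm -rf pattern is the simple form quoted earlier, which is an approximation of what the recycler Pod does.

```shell
# Local simulation only: each temporary directory stands in for the storage
# behind one PV. Nothing here talks to a cluster.
recycled_pv=$(mktemp -d)
retained_pv=$(mktemp -d)
touch "$recycled_pv/a" "$retained_pv/a"   # data written by a Pod

# Recycle: the contents are scrubbed, the volume itself stays.
rm -rf "$recycled_pv"/*

# Retain: nothing is touched when the claim goes away.
recycled_contents=$(ls -A "$recycled_pv")
retained_contents=$(ls -A "$retained_pv")
echo "Recycle leaves: '$recycled_contents'"
echo "Retain leaves:  '$retained_contents'"

rm -rf "$recycled_pv" "$retained_pv"
```

Note that the plain `*` glob skips dotfiles, so a real scrub needs more care than this one-liner suggests.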

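Dynamic provisioning

Compared with static provisioning, dynamic provisioning has a clear advantage: PVs do not have to be created in advance.

Dynamic provisioning is implemented through StorageClass. NFS, however, has no built-in provisioner in Kubernetes, so it does not support dynamic provisioning out of the box.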
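
To give an idea of the shape, here is a sketch of a StorageClass and a PVC that refers to it. The provisioner value is a deliberate placeholder, and the names dynamic-sketch, dynamic-pvc, and storageclass-sketch.yml are invented for illustration; a real cluster needs the name of a provisioner that is actually installed.

```shell
# Write a hypothetical StorageClass plus a PVC that uses it.
cat > storageclass-sketch.yml << 'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dynamic-sketch
# Placeholder: replace with the provisioner actually installed in the cluster.
provisioner: example.com/hypothetical-provisioner
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: dynamic-sketch   # no PV is created by hand for this claim
EOF
echo "wrote storageclass-sketch.yml"
```

The PVC looks the same as in the static case; the only difference is that its storageClassName points at a class with a provisioner, which creates the matching PV on demand.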

To be continued~

For other types of dynamically provisioned PVs, see the official documentation.

Recovering MySQL from a failure with a PVC

mysql.yml

apiVersion: v1
kind: PersistentVolume
metadata: 
  name: mysql-pv
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 1Gi
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs: 
    path: /data/nfs
    server: 172.19.0.4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs
---
apiVersion: v1
kind: Service
metadata: 
  name: mysql
spec:
  ports:
  - port: 3306
  selector:
    app: mysql
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - image: mysql:5.6
        name: mysql
        env: 
          - name: MYSQL_ROOT_PASSWORD
            value: password
        ports: 
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-persistent-storage
        persistentVolumeClaim:
          claimName: mysql-pvc

Note that the NFS export needs no_root_squash here; otherwise MySQL fails with permission errors

$ vi /etc/exports
/data/nfs 172.19.0.0/16(rw,sync,no_root_squash)
$ systemctl reload nfs

Apply it

[why@why-01 ~]$ kubectl apply -f mysql.yml 
persistentvolume/mysql-pv created
persistentvolumeclaim/mysql-pvc created
service/mysql created
deployment.apps/mysql created
[why@why-01 ~]$ kubectl get pods -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
mysql-7686899cf9-4jtmg   1/1     Running   0          8s    10.244.1.43   why-02   <none>           <none>

Write some data into MySQL

[why@why-01 ~]$ kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword
If you don't see a command prompt, try pressing enter.

mysql> use mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> create table test(id int(4));
Query OK, 0 rows affected (0.05 sec)

mysql> insert test values( 1111 );
Query OK, 1 row affected (0.01 sec)

mysql> select * from test;
+------+
| id   |
+------+
| 1111 |
+------+
1 row in set (0.00 sec)

Simulate a failure by shutting down the node that runs the Pod

[why@why-01 ~]$ kubectl get pods -o wide
NAME                     READY   STATUS                       RESTARTS   AGE    IP            NODE     NOMINATED NODE   READINESS GATES
mysql-7686899cf9-4jtmg   1/1     Terminating                  0          77m    10.244.1.43   why-02   <none>           <none>
mysql-7686899cf9-7wj5f   1/1     Running                      0          62m    10.244.2.73   why-03   <none>           <none>

Query again

[why@why-01 ~]$ kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql -ppassword
If you don't see a command prompt, try pressing enter.

mysql> use mysql
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from test;
+------+
| id   |
+------+
| 1111 |
+------+
1 row in set (0.00 sec)

The data is still there.

Secret

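An application may need sensitive information at startup, such as the username and password for a database, or keys. Baking these into the container image is clearly a bad idea; the solution Kubernetes offers is Secret.

A Secret keeps sensitive data out of Pod specs and images. Note that the values are stored base64-encoded, which is an encoding rather than encryption. A Secret can be mounted into a Pod as a Volume so that containers read the sensitive data as files; containers can also consume it through environment variables.

Creating a Secret

There are four ways to create a Secret:

1. Via --from-literal: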

$ kubectl create secret generic mysecret --from-literal=username=admin --from-literal=password=123456

Each --from-literal corresponds to one entry.

2. Via --from-file:

$ echo -n admin > ./username
$ echo -n 123456 > ./password
$ kubectl create secret generic mysecret --from-file=./username --from-file=./password

Each file's content corresponds to one entry.

3. Via --from-env-file:

$ cat << EOF > env.txt
username=admin
password=123456
EOF
$ kubectl create secret generic mysecret --from-env-file=env.txt

Each Key=Value line in env.txt corresponds to one entry.

4. Via a YAML configuration file:

mysecret.yml

apiVersion: v1
kind: Secret
metadata:
  name: mysecret
data:
  username: YWRtaW4=
  password: MTIzNDU2

The values of these fields are base64-encoded (an encoding, not encryption):

$ echo -n admin | base64
YWRtaW4=
$ echo -n 123456 | base64
MTIzNDU2
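
Encoding each value by hand gets tedious, so the manifest can also be generated. The sketch below assumes the same two values as above; the output file name mysecret.generated.yml is made up so that it does not overwrite mysecret.yml.

```shell
username=admin
password=123456

# printf '%s' avoids the trailing newline that a plain `echo` would add,
# which would otherwise end up inside the encoded value.
encoded_username=$(printf '%s' "$username" | base64)
encoded_password=$(printf '%s' "$password" | base64)

# Unquoted heredoc delimiter so the variables are expanded.
cat > mysecret.generated.yml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
data:
  username: $encoded_username
  password: $encoded_password
EOF
cat mysecret.generated.yml
```

The generated data section matches the hand-written mysecret.yml above.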

Apply the Secret

$ kubectl apply -f mysecret.yml
secret/mysecret created

View the created Secret

$ kubectl get secrets mysecret 
NAME       TYPE     DATA   AGE
mysecret   Opaque   2      3m58s
$ kubectl describe secret mysecret 
Name:         mysecret
Namespace:    default
Labels:       <none>
Annotations:  
Type:         Opaque

Data
====
password:  6 bytes
username:  5 bytes

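describe shows only the sizes: username is 5 bytes and password is 6 bytes.

To see the actual values, open the Secret and decode the base64 strings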

$ kubectl edit secret mysecret
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  password: MTIzNDU2
  username: YWRtaW4=
kind: Secret
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"password":"MTIzNDU2","username":"YWRtaW4="},"kind":"Secret","metadata":{"annotations":{},"name":"mysecret","namespace":"default"}}
  creationTimestamp: "2018-12-10T08:42:44Z"
  name: mysecret
  namespace: default
  resourceVersion: "493011"
  selfLink: /api/v1/namespaces/default/secrets/mysecret
  uid: 9067c44e-fc57-11e8-91a0-5254005c0df5
type: Opaque

$ echo -n MTIzNDU2 | base64 --decode
123456
$ echo -n YWRtaW4= | base64 --decode
admin

Using a Secret through a Volume

pod-sec.yml

apiVersion: v1
kind: Pod
metadata:
  name: mysecret
spec:
  containers:
  - name: mysecret
    image: busybox
    args:
      - /bin/sh
      - -c
      - sleep 10; touch /tmp/healthy; sleep 30000
    volumeMounts:
    - name: foo
      mountPath: "/etc/foo"
      readOnly: true
  volumes:
  - name: foo
    secret:
      secretName: mysecret

Mount it into the Pod

$ kubectl apply -f pod-sec.yml
pod/mysecret created
$ kubectl exec -it mysecret -- sh
/ # cat /etc/foo/username
admin
/ # cat /etc/foo/password 
123456

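Kubernetes creates one file per entry under /etc/foo: the file name is the key and the file content is the value.

The path of each key can also be set explicitly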

  volumes:
  - name: foo
    secret:
      secretName: mysecret
      items:
      - key: username
        path: mydata/myusername
      - key: password
        path: mydata/mypassword

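Data consumed through a Volume is refreshed in the container when the Secret changes.

Modify mysecret.yml

  password: YWJjZGVm

After applying the change, the new value shows up in the container after a short delay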

/ # cat /etc/foo/password 
123456
/ # cat /etc/foo/password 
abcdef

Using a Secret through environment variables

pod-sec2.yml

apiVersion: v1
kind: Pod
metadata:
  name: mysecret
spec:
  containers:
  - name: mysecret
    image: busybox
    args:
      - /bin/sh
      - -c
      - sleep 10; touch /tmp/healthy; sleep 30000
    env:
      - name: SECRET_USERNAME
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: username
      - name: SECRET_PASSWORD
        valueFrom:
          secretKeyRef:
            name: mysecret
            key: password

After applying, check the environment variables

$ kubectl apply -f pod-sec2.yml
pod/mysecret created
$ kubectl exec -it mysecret -- sh
/ # echo $SECRET_USERNAME
admin
/ # echo $SECRET_PASSWORD
abcdef

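Reading a Secret through environment variables is convenient, but it does not support dynamic updates: the values are fixed when the container starts.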
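
The reason is the same as for any process: an environment variable is a copy taken at start time, while a file is re-read on every access. A minimal local sketch, assuming a temporary file can stand in for /etc/foo/password (no cluster involved):

```shell
# Local simulation only: a temporary file stands in for /etc/foo/password.
secret_file=$(mktemp)
printf '%s' 123456 > "$secret_file"

# Environment-variable style: the value is copied once, at "container start".
SECRET_PASSWORD=$(cat "$secret_file")

# The Secret is updated afterwards.
printf '%s' abcdef > "$secret_file"

# Volume style: the file is read again, so the new value is visible.
from_file=$(cat "$secret_file")
echo "env var: $SECRET_PASSWORD, file: $from_file"

rm -f "$secret_file"
```

A Pod that consumes a Secret through environment variables therefore has to be recreated to pick up a change.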

ConfigMap

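A Secret provides Pods with sensitive data such as passwords, tokens, and private keys.
For non-sensitive data, such as application configuration, use a ConfigMap instead.

Like Secret, ConfigMap supports four ways of creation:

Via --from-literal:

$ kubectl create configmap myconfigmap --from-literal=config1=xxx --from-literal=config2=yyy

Each --from-literal corresponds to one entry.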

Via --from-file:

$ echo -n xxx > ./config1
$ echo -n yyy > ./config2
$ kubectl create configmap myconfigmap --from-file=./config1 --from-file=./config2

Each file's content corresponds to one entry.

Via --from-env-file:

$ cat << EOF > env.txt
config1=xxx
config2=yyy
EOF
$ kubectl create configmap myconfigmap --from-env-file=env.txt

Each Key=Value line in env.txt corresponds to one entry.

Via a YAML configuration file:

configmap.yml

apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap
data:
  config1: xxx
  config2: yyy

Create the ConfigMap

[why@why-01 ~]$ kubectl apply -f configmap.yml 
configmap/myconfigmap created
[why@why-01 ~]$ kubectl get configmaps 
NAME          DATA   AGE
myconfigmap   2      9s
[why@why-01 ~]$ kubectl describe configmaps 
Name:         myconfigmap
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","data":{"config1":"xxx","config2":"yyy"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"myconfigmap","namespac...

Data
====
config1:
----
xxx
config2:
----
yyy
Events:  <none>

Like a Secret, a ConfigMap is consumed through a Volume or through environment variables.

pod-configmap.yml

apiVersion: v1
kind: Pod
metadata:
  name: mysecret
spec:
  containers:
  - name: mysecret
    image: busybox
    args:
      - /bin/sh
      - -c
      - sleep 10; touch /tmp/healthy; sleep 30000
    volumeMounts:
    - name: foo
      mountPath: "/etc/foo"
      readOnly: true
  volumes:
  - name: foo
    configMap:
      name: myconfigmap

pod-configmap2.yml

apiVersion: v1
kind: Pod
metadata:
  name: mysecret
spec:
  containers:
  - name: mysecret
    image: busybox
    args:
      - /bin/sh
      - -c
      - sleep 10; touch /tmp/healthy; sleep 30000
    env:
      - name: CONFIG_1
        valueFrom:
          configMapKeyRef:
            name: myconfigmap
            key: config1
      - name: CONFIG_2
        valueFrom:
          configMapKeyRef:
            name: myconfigmap
            key: config2

After applying, the plaintext configuration is visible in the Pod

[why@why-01 ~]$ kubectl apply -f pod-configmap.yml
pod/mysecret created
[why@why-01 ~]$ kubectl exec -it mysecret -- sh
/ # cat /etc/foo/config1 
xxx
/ # cat /etc/foo/config2
yyy

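Application configuration usually lives in files, so a single ConfigMap entry can hold the contents of a whole file.

configmap2.yml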

apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap2
data:
  app.conf: |
    config1: 1
    config2: 2
    config3: 3
    config4: 4

Apply it

$ kubectl apply -f configmap2.yml 
configmap/myconfigmap2 created
$ kubectl describe configmaps myconfigmap2 
Name:         myconfigmap2
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","data":{"app.conf":"config1: 1\nconfig2: 2\nconfig3: 3\nconfig4: 4\n"},"kind":"ConfigMap","metadata":{"annotations":{},...

Data
====
app.conf:
----
config1: 1
config2: 2
config3: 3
config4: 4

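The file content is not limited to the colon form; the equals-sign form, or any other text format, works as well.

configmap3.yml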

apiVersion: v1
kind: ConfigMap
metadata:
  name: myconfigmap3
data:
  app.conf: |
    config1=1
    config2=2
    config3=3
    config4=4

Apply it and look at the ConfigMap

$ kubectl apply -f configmap3.yml 
configmap/myconfigmap3 created
[why@why-01 ~]$ kubectl describe configmaps myconfigmap3 
Name:         myconfigmap3
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","data":{"app.conf":"config1=1\nconfig2=2\nconfig3=3\nconfig4=4\n"},"kind":"ConfigMap","metadata":{"annotations":{},"nam...

Data
====
app.conf:
----
config1=1
config2=2
config3=3
config4=4
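
A file-style entry like this is normally consumed through a Volume, so the application finds app.conf at a known path. The sketch below writes such a Pod manifest; the Pod name app-conf-demo, the mount path /etc/app, and the file name pod-configmap-file.yml are assumptions for illustration.

```shell
# Write a hypothetical Pod manifest that mounts myconfigmap3 as a file.
cat > pod-configmap-file.yml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: app-conf-demo
spec:
  containers:
  - name: app
    image: busybox
    args:
      - /bin/sh
      - -c
      - cat /etc/app/app.conf; sleep 30000
    volumeMounts:
    - name: app-config
      mountPath: /etc/app
      readOnly: true
  volumes:
  - name: app-config
    configMap:
      name: myconfigmap3
      items:
      - key: app.conf
        path: app.conf   # file name inside the mount path
EOF
echo "wrote pod-configmap-file.yml"
# Against a real cluster, the next step would be: kubectl apply -f pod-configmap-file.yml
```

With this mount the container would see the four config lines in /etc/app/app.conf.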

Yan, big data engineer at 火眼征信