
Portworx cheatsheet

Portworx commands cheatsheet

Restore (cloud)snapshot over running application (in-place snapshot)

It is possible to “in-place” restore a cloudsnap. But only if stork-scheduler is used with the running application.

  1. Stork takes the pods using that PVC offline.
  2. Stork restores the volume from the snapshot.
  3. Stork brings the pods back online.
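The stork requirement above boils down to the application's pod template naming stork as its scheduler; a minimal fragment (the surrounding Deployment is omitted):

```yaml
# Fragment of a pod template spec; in-place restore only works when the
# application's pods are scheduled by stork.
spec:
  schedulerName: stork
```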

Find the correct snapshot to restore

kubectl get volumesnapshot -n "namespace"

To restore the snapshot in-place

apiVersion: stork.libopenstorage.org/v1alpha1
kind: VolumeSnapshotRestore
metadata:
  name: mysql-snap-inrestore # just a temporary name; the object only exists while the restore is in progress.
  namespace: default # where the PVC resides
spec:
  sourceName: mysql-snapshot # snapshot name from "kubectl get volumesnapshot"
  sourceNamespace: mysql-snap-restore-splocal # namespace where the snapshot resides

Apply the above yml

kubectl apply -f inplacerestore.yml

Restore (cloud)snapshot

Login to a portworx node

cd /opt/pwx/bin
# First check if we can use a local snapshot (if a snapshot of <24h old is needed).
./pxctl volume list --snapshot # Lists the local snapshots; only one is kept per volume, and the time is in the name.
# If a local snapshot is recent enough, save the name of the snap-volume. Otherwise continue to the cloudsnap list command.
./pxctl volume create $new_volume_name --size $size_in_GB --repl $replication_factor --io_priority HIGH
./pxctl volume restore --snapshot $snap_volume_name $new_volume_name


# If no local snapshot could be used, take one from the cloud.
pxctl credentials list
pxctl cloudsnap list --cred-id <id> # this will take a while as it needs to fetch all data from the cloud. Save the cloud-snap-id of the snapshot you want to restore.
./pxctl cloudsnap restore --snap $cloud_snap_id -v $new_volume_name


# You now have a portworx volume with the name $new_volume_name

Run the next steps on the master.

Create this file and fill in all variables with the outputs from the next steps (or be creative).

apiVersion: v1
kind: PersistentVolume
metadata:
  name: $new_pv_name                                        # Replace this, create your own.
  namespace: $namespace                                     # Replace this
  annotations:
     volume.beta.kubernetes.io/storage-class: portworx-r3   # Doublecheck this with output of kubectl get pv
  labels:
    name: $new_pv_name                                      # Replace this (same as metadata.name)
spec:
  capacity:
    storage: ${size_in_gi}Gi                                # Replace this with output of kubectl get pv
  accessModes:
    - ReadWriteOnce                                         # Doublecheck this with output of kubectl get pv (abbreviated there to eg. RWO for ReadWriteOnce)
  persistentVolumeReclaimPolicy: Delete
  portworxVolume:
    volumeID: $new_volume_name                              # Replace this with the __name__, NOT ID of the new volume as defined when restoring the snapshot.
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: $old_pv                                             # Replace this with the exact same name as the previous pvc
  namespace: $namespace                                     # Replace this
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx-r3    # Doublecheck this
spec:
  selector:
    matchLabels:
      name: $new_pv_name                                    # Replace this
  accessModes:
    - ReadWriteOnce                                         # Doublecheck this
  resources:
    requests:
      storage: ${size_in_gi}Gi                              # Replace this

kubectl get pv # Long list, so grep is advisable. Save the name of the PV (first column) and of the PVC (6th column).
kubectl patch pv $old_pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' # This will make sure the old volume is not deleted, in case we need to roll-back.
# Make sure nothing is using the PVC: scale down any deployments using it (describe, save the replica count, then kubectl scale deployment -n $namespace $deployment --replicas=0).
kubectl -n $namespace delete pvc $old_pvc
kubectl apply -f restore.yaml
# Scale the deployments back up to the old replica counts.
# If everything works, scale the Portworx replication factor back up. This needs to run on a Portworx node.
./pxctl volume ha-update --repl=$repl_count $volume_name # You can only add one replica at a time and replication might take a while.
./pxctl volume inspect $volume_name # To check the status of the volume (and replication status).

# After customer approval, a night of sleep, 2 cups of coffee/tea and a good check on everything, delete the old pv.
kubectl delete pv $old_pv 

Create snapshot and restore to other namespace

apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: snapshotname
  namespace: platform  # Source namespace
  annotations:
    stork.libopenstorage.org/snapshot-restore-namespaces: "default"  # Destination namespaces; can be comma separated to list all namespaces where the restore may be done
spec:
  persistentVolumeClaimName: source-pvc  # PVC in the source namespace

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restoredpvc # PVC name of restored pvc
  namespace: default # Namespace to restore in
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: snapshotname  # Name as defined in make snapshot
    stork.libopenstorage.org/snapshot-source-namespace: platform # Source namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: stork-snapshot-sc # Don't change this
  resources:
    requests:
      storage: 2Gi # Should be sufficient to contain the data

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: copypod
  name: copypod
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      run: copypod
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
      labels:
        run: copypod
    spec:
      containers:
      - args:
        - sleep
        - "1000000"
        image: centos
        volumeMounts:
        - mountPath: /new
          name: new-data
        imagePullPolicy: Always
        name: copypod
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: stork
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: new-data
        persistentVolumeClaim:
          claimName: restoredpvc

Create / restore group snapshot

It is possible to create a group snapshot. A group snapshot snapshots all PVCs of an application, selected by PVC labels. For instance, Harbor has multiple PVCs; with one group snapshot all those PVCs can be backed up, and with one restore all of them can be restored.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: mysql-group-snap
spec:
  pvcSelector:
    matchLabels:
      app: mysql # PVC label which needs to be on all PVCs.

Note: the restore below only works if “stork” is the scheduler that was used.

apiVersion: stork.libopenstorage.org/v1alpha1
kind: VolumeSnapshotRestore
metadata:
  name: mysql-snap-inrestore # just a temporary name; the object only exists while the restore is in progress.
  namespace: default # where the PVC resides
spec:
  groupSnapshot: true # Necessary if restoring a groupsnapshot.
  sourceName: mysql-snapshot # snapshot name from "kubectl get volumesnapshot"
  sourceNamespace: mysql-snap-restore-splocal # namespace where the snapshot resides

Create local snapshot by k8s

apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
  namespace: platform
spec:
  persistentVolumeClaimName: test-pvc

List snapshots

kubectl get volumesnapshot -n platform
./pxctl volume list -s
./storkctl -n platform get snap

Restore from the local k8s snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: snap-pvc
  namespace: platform
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: snapshot-from-VolumeSnapshot 
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 1Gi

Create cloud snapshot by k8s

apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: cloud-test-snapshot
  namespace: platform
  annotations:
    portworx/snapshot-type: cloud
spec:
  persistentVolumeClaimName: test-pvc

Restore from the cloud k8s snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: snap-pvc
  namespace: platform
  annotations:
    snapshot.alpha.kubernetes.io/snapshot: snapshot-from-VolumeSnapshot 
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: stork-snapshot-sc
  resources:
    requests:
      storage: 1Gi

Create Volume in Portworx

./pxctl volume create newvolumename --size 2 --repl 3

Copy Volume in Portworx

./pxctl volume clone --name newvolumename volume-to-clone

Create local snapshot by portworx

./pxctl volume snapshot create --name mysnap volume-name-from-pwx-list

Create cloud snapshot by portworx

./pxctl cloudsnap backup volumesnapshotname               # if no cloud snapshot exists yet
./pxctl cloudsnap restore --snap CLOUD-SNAP-ID -v mysnap
./pxctl volume ha-update -r 3 mysnap                      # if HA != 3

Restore Portworx snapshot in Pod

apiVersion: v1
kind: Pod
metadata:
   name: nginx-px
   namespace: platform
spec:
   schedulerName: stork
   containers:
   - image: nginx
     name: nginx-px
     volumeMounts:
     - mountPath: /test-portworx-volume
       name: mysnap
   volumes:
   - name: mysnap
     portworxVolume:
       volumeID: mysnap

Restore Portworx snapshot for Deployment

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysnap
  namespace: platform
  annotations:
     volume.beta.kubernetes.io/storage-class: portworx-r3
  labels:
    name: mysnap
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  portworxVolume:
    volumeID: mysnap
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: new-pvc
  namespace: platform
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx-r3
spec:
  selector:
    matchLabels:
      name: mysnap
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Check if snap was uploaded to cloud correctly

VOL=YourSnapshotName
kubectl get volumesnapshotdatas.volumesnapshot.external-storage.k8s.io \
  $(kubectl get volumesnapshot -n platform ${VOL} -o json | jq -r .spec.snapshotDataName) \
  -o json | jq .spec.portworxVolume.snapshotId

Go to the Azure portal: customer backup - Blob - portworx-volume-snapshots. Search for the number from the snapshot ID. If the snapshot was taken properly and was not empty, you should see many encrypted parts of it. A snapshot of an empty volume contains only: 0, catalogue, extmap and metadata.

PVC size update

Extend PVC storage by editing the PVC spec (kubectl edit pvc) and raising the requested size:

spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

NOTICE: StorageClass must have allowVolumeExpansion: true and PVC must be in use by a Pod.
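To illustrate the NOTICE, a StorageClass sketch with expansion enabled; the name and parameters are assumptions matching the portworx-r3 class used elsewhere in these notes, not a dump from a real cluster:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: portworx-r3              # assumed name
provisioner: kubernetes.io/portworx-volume
parameters:
  repl: "3"                      # assumed replication factor
allowVolumeExpansion: true       # without this, the PVC resize is rejected
```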

Resize storage pool

Portworx has a btrfs filesystem, either directly on the disk (/dev/sdc) or on the /dev/sdc2 partition.

One by one for all nodes, put Portworx into maintenance mode (pxctl service maintenance --enter). Resize the disks on the Portworx node (at the IaaS/VM level) to the new size. After this, run the following commands on the Portworx node:

sudo fdisk -l /dev/sdc
# if you have a partition table (you'll see /dev/sdc1 and /dev/sdc2) you need to do a bit more to resize the 2nd partition; if you don't have a partition table, go directly to # extend the pool within portworx.
# to extend partition 2, we delete the partition 2 and recreate it (don't worry, no data-loss)
sudo fdisk /dev/sdc
Command (m for help): d
Partition number (1,2, default 2): 2
#
#Partition 2 has been deleted.

Command (m for help): n
Partition number (2-128, default 2): 
First sector (6289408-256819166, default 6289408): # (just leave it to default, whatever the values are)
Last sector, +/-sectors or +/-size{K,M,G,T,P} (6289408-256819166, default 256819166): # (just leave it to default, whatever the values are)

#Created a new partition 2 of type 'Linux filesystem' and of size 119.5 GiB.
#Partition #2 contains a btrfs signature.

Do you want to remove the signature? [Y]es/[N]o: n

Command (m for help): w

#The partition table has been altered.
#Syncing disks.

sudo partprobe
sudo reboot
# You could do an fdisk -l /dev/sdc again, should show the bigger size for /dev/sdc2

# extend the pool within portworx, if you do this directly after a reboot you might get a message that portworx isn't running, just wait a minute and retry
/opt/pwx/bin/pxctl service pool update --resize 0
/opt/pwx/bin/pxctl service maintenance --exit

# You can see /verify the new, bigger pool size with pxctl status
# continue to next node

Restore a file

Create a new PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
  namespace: platform
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Mount new PVC to pod

      volumeMounts:
      - mountPath: /data
        name: data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: test-pvc

  1. Copy the file from the pod to the new folder with the mounted PVC.
  2. Create a snapshot of the new PVC (above).
  3. Restore the snapshot (above).

Delete snapshots

kubectl delete -f vs_file.yml
kubectl delete volumesnapshot -n platform  vs_name
./storkctl delete volumesnapshots --namespace platform --pvc pvc-name # ALL snaps for that PVC
./pxctl volume delete volumeName

CRUD schedulepolicies

kubectl get schedulepolicies.stork.libopenstorage.org --all-namespaces
kubectl edit schedulepolicies.stork.libopenstorage.org  interval-1440-retain-31
kubectl describe schedulepolicies.stork.libopenstorage.org interval-1440-retain-31
kubectl delete schedulepolicies.stork.libopenstorage.org  interval-1440-retain-31
./pxctl volume snap-interval-update volumeName -p Nr
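For reference, a SchedulePolicy like the interval-1440-retain-31 used above could be created with a manifest along these lines (a sketch; the values are inferred from the name):

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: SchedulePolicy
metadata:
  name: interval-1440-retain-31
policy:
  interval:
    intervalMinutes: 1440  # take a snapshot every 24 hours
    retain: 31             # keep the last 31 snapshots
```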

CRUD storageclasses

kubectl get storageclasses.storage.k8s.io  --all-namespaces 
kubectl edit storageclasses.storage.k8s.io portworx-r3
kubectl describe storageclasses.storage.k8s.io portworx-r3
kubectl delete storageclasses.storage.k8s.io portworx-r3

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nexus-restore
  namespace: platform
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx-r3
  labels:
    name: nexus-restore
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  portworxVolume:
    volumeID: nexus-restore
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nexus-pvc
  namespace: platform
  annotations:
    volume.beta.kubernetes.io/storage-class: portworx-r3
spec:
  selector:
    matchLabels:
      name: nexus-restore
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi

Fix pod not starting because not able to mount portworx volume

We’ve seen this happen for cert-copier. When you inspect the volume in Portworx you see it has a consumer and the volume is attached, but “mount | grep pxd” does not show that volume mounted. If you delete the pod, the volume no longer has a consumer, but it is still attached. Portworx then seems to think it is attached and doesn’t mount it anymore. To fix: delete the pod (or scale the deployment/sts to 0), detach the volume and scale up again. Below are some commands to use when troubleshooting.

pxctl host detach <volume_id>
pxctl volume inspect <volume_id>
mount | grep pxd

Change free space threshold

If a Portworx node reaches its free space threshold it will mark the node as offline; the replicas of this node’s volumes that live on the other nodes are still available.

In this case you should add storage capacity. A short-term solution to get the node online again is to temporarily lower the free space threshold:

pxctl cluster options update --free-space-threshold-gb 100

To check the current threshold:

pxctl cluster options list -j | grep FreeSpaceThresholdGB
 "FreeSpaceThresholdGB": 100,

Change io_profile

pxctl -j v i 123456789 # "v i" is short for "volume inspect"; -j gives JSON output

pxctl v i 123456789
	Volume          	 :  123456789
	Name            	 :  pvc-1234-1234-1234

pxctl volume update --io_profile db_remote 123456789

pxctl -j v i pvc-1234-1234-1234 | grep -i profile
  "io_profile": "db_remote",
Last updated on 8 Dec 2021
Published on 27 Sep 2021