The PersistentVolume node affinity API has been part of Kubernetes since v1.10, serving as a critical mechanism to indicate that volumes aren't uniformly accessible across all cluster nodes. Previously immutable, this field becomes mutable in Kubernetes v1.35 as an alpha feature, enabling more dynamic volume management strategies.
Why make node affinity mutable?
The rationale for this change becomes clear when you consider the fundamental difference between stateless and stateful workloads. Deployments can be modified and rolled out seamlessly through Pod recreation, but PersistentVolumes carry state that can't be discarded without data loss.
Storage technology doesn't stand still. Providers now commonly offer regional disks, with some supporting live migration from zonal to regional configurations without workload interruption. While the VolumeAttributesClass API (GA in v1.34) can express these storage-side changes, Kubernetes would still restrict Pod scheduling to the original zone based on the PV's node affinity. With mutable node affinity, you can update the constraint from:
```yaml
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east1-b
```
to:
```yaml
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/region
          operator: In
          values:
          - us-east1
```
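To make the effect of such an update concrete, here is a simplified sketch of the check the scheduler performs when matching a PV's node affinity against a node's labels. The real implementation lives in Kubernetes' scheduling helper libraries and supports more operators; this sketch handles only the `In` operator used in the examples above.

```python
# Simplified sketch of PV node affinity matching: a node matches if ANY
# nodeSelectorTerm matches, and a term matches if ALL of its expressions match.
# Only the "In" operator is implemented here.

def matches_node_affinity(node_labels: dict, node_selector_terms: list) -> bool:
    for term in node_selector_terms:
        if all(
            expr["operator"] == "In"
            and node_labels.get(expr["key"]) in expr["values"]
            for expr in term.get("matchExpressions", [])
        ):
            return True
    return False

# The zonal constraint before the update, and the regional one after it:
zonal = [{"matchExpressions": [{"key": "topology.kubernetes.io/zone",
                                "operator": "In", "values": ["us-east1-b"]}]}]
regional = [{"matchExpressions": [{"key": "topology.kubernetes.io/region",
                                   "operator": "In", "values": ["us-east1"]}]}]

# A node in a different zone of the same region:
node_labels = {
    "topology.kubernetes.io/zone": "us-east1-c",
    "topology.kubernetes.io/region": "us-east1",
}

print(matches_node_affinity(node_labels, zonal))     # False: wrong zone
print(matches_node_affinity(node_labels, regional))  # True: same region
```

After relaxing the constraint from zonal to regional, nodes in every zone of `us-east1` become eligible for Pods using the volume.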
Hardware generations present another scenario. Newer disk types may be incompatible with older nodes, a constraint expressed through PV node affinity to ensure correct Pod placement. When upgrading disks, you'll want to prevent scheduling to incompatible nodes by updating the affinity from:
```yaml
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: provider.com/disktype.gen1
          operator: In
          values:
          - available
```
to:
```yaml
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: provider.com/disktype.gen2
          operator: In
          values:
          - available
```
While removing a single API server validation might seem straightforward, full integration with the Kubernetes ecosystem requires substantial additional work.
Try it out
This feature targets cluster administrators whose storage providers support online updates that affect volume accessibility.
Modifying PV node affinity doesn't alter the underlying volume's actual accessibility. You must first update the volume through your storage provider and verify which nodes can access it post-update. Only then should you enable this feature to synchronize the PV node affinity.
As an alpha feature, it's disabled by default and subject to change. Enable the MutablePVNodeAffinity feature gate on the API server to edit the spec.nodeAffinity field. Since PV editing typically requires administrator privileges, verify your RBAC permissions before proceeding.
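As an illustration, the update could be applied as a patch once the storage-side change is complete. The PV name and the target affinity below are hypothetical examples; substitute your own values.

```yaml
# patch.yaml (hypothetical example): replaces the PV's node affinity with a
# regional constraint. Apply only AFTER the storage provider has confirmed
# the volume is accessible from the new set of nodes.
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/region
          operator: In
          values:
          - us-east1
```

With the `MutablePVNodeAffinity` feature gate enabled on the API server (`--feature-gates=MutablePVNodeAffinity=true`), this could be applied with something like `kubectl patch pv <pv-name> --type merge --patch-file patch.yaml`; a JSON merge patch replaces the `nodeSelectorTerms` list wholesale, which is usually what you want here.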
Race condition between updating and scheduling
PV node affinity is among the few external factors influencing Pod scheduling decisions. Relaxing node affinity to permit more nodes is safe, but tightening it introduces a race condition: the scheduler's cache may not immediately reflect the modified PV, creating a window where Pods could be placed on nodes that can no longer access the volume. Such Pods will remain stuck in the ContainerCreating state.
A proposed mitigation—having the kubelet fail Pod startup when PV node affinity is violated—remains under discussion. Until implemented, monitor any Pods using updated PVs to ensure they're scheduled on compatible nodes. Scripted workflows that update PVs and immediately launch Pods may not behave as expected.
Future integration with CSI (Container Storage Interface)
Currently, administrators must manually coordinate changes between PV node affinity and the storage provider. This manual approach is both error-prone and inefficient. The goal is eventual integration with VolumeAttributesClass, allowing unprivileged users to modify their PersistentVolumeClaim to trigger storage updates, with PV node affinity updating automatically without administrator intervention.
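To sketch what that envisioned flow might look like: a user would switch their claim to a VolumeAttributesClass describing the new storage attributes, and an integrated CSI driver would update the PV's node affinity once the storage-side change completes. The driver name and parameters below are hypothetical and driver-specific; only the API shape of VolumeAttributesClass and `volumeAttributesClassName` comes from the existing GA API.

```yaml
apiVersion: storage.k8s.io/v1
kind: VolumeAttributesClass
metadata:
  name: regional-disk
driverName: disk.csi.example.com   # hypothetical driver
parameters:
  replication-type: regional       # hypothetical, driver-specific parameter
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  volumeAttributesClassName: regional-disk   # user-triggered storage update
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
```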
We welcome feedback from users and storage driver developers
This represents just the initial step.
If you're a Kubernetes user, share how you currently use—or plan to use—PV node affinity. Would online updates benefit your workflows?
If you develop CSI drivers, would you implement this feature? What API design would work best for your needs?
Provide feedback through:
- Slack channel #sig-storage
- Mailing list kubernetes-sig-storage
- The KEP issue Mutable PersistentVolume Node Affinity
For specific questions about this feature, contact the SIG Storage community.