Cannot delete OpenStack volumes after Ceph upgrade to Luminous
We recently upgraded our OpenStack Ceph cluster to Luminous.
Everything seemed to be fine. But when deleting obsolete Cinder volumes, we noticed that their state changed from available to deleting for about 30 seconds, only to go back to available instead of the volume being removed.
Analyzing the problem
The Cinder logs showed the following lines:
cinder.log
WARNING cinder.volume.drivers.rbd [...] ImageBusy error raised while deleting rbd volume. This may have been caused by a connection from a client that has crashed and, if so, may be resolved by retrying the delete after 30 seconds has elapsed.
ERROR cinder.volume.manager [...] Unable to delete busy volume.
Usually this happens when an RBD snapshot was taken directly, bypassing Cinder, or when some instance is still watching this particular disk.
list rbd snapshots
:~$ rbd -p cinder snap ls volume-730501ce-cf52-4888-a0de-597f587904b1
:~$ rbd -p cinder-cache snap ls volume-730501ce-cf52-4888-a0de-597f587904b1
get rbd watchers
:~$ rbd -p cinder status volume-730501ce-cf52-4888-a0de-597f587904b1
Watchers: none
:~$ rbd -p cinder-cache status volume-730501ce-cf52-4888-a0de-597f587904b1
Watchers: none
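The two checks above can be wrapped in a small helper when several volumes or pools need to be inspected. This is only a sketch: the function name `check_volume` and the overridable `RBD` variable are made up for illustration, and it assumes `rbd` is on the PATH with credentials that may read the pools.

```shell
#!/bin/sh
# Sketch: run the snapshot and watcher checks for one volume in several pools.
# RBD is a hypothetical override hook (e.g. RBD=echo for a dry run).
RBD="${RBD:-rbd}"

check_volume() {
    volume="$1"
    shift
    for pool in "$@"; do
        echo "== ${pool}/${volume}: snapshots =="
        $RBD -p "$pool" snap ls "$volume"
        echo "== ${pool}/${volume}: watchers =="
        $RBD -p "$pool" status "$volume"
    done
}

# Example (needs access to the cluster):
# check_volume volume-730501ce-cf52-4888-a0de-597f587904b1 cinder cinder-cache
```

Setting `RBD=echo` before calling the function prints the commands instead of running them.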
Permission issues
Neither scenario applied in our case. Instead, these volumes were still locked, even though the corresponding instances had been deleted a long time ago:
get rbd locks
:~$ rbd -p cinder lock list volume-730501ce-cf52-4888-a0de-597f587904b1
There is 1 exclusive lock on this image.
Locker            ID                    Address
client.529785369  auto 140589714861088  10.80.8.101:0/863421692
:~$ rbd -p cinder-cache lock list volume-730501ce-cf52-4888-a0de-597f587904b1
There is 1 exclusive lock on this image.
Locker            ID                    Address
client.529785369  auto 140589714861088  10.80.8.101:0/863421692
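Because the lock ID itself contains a space ("auto 140589714861088"), it is easy to get the arguments for `rbd lock rm` wrong when copying them by hand. The following sketch turns `rbd lock list` output into the matching removal commands and only prints them for review. The helper name `print_lock_rm` is hypothetical, and the parsing assumes the column layout shown above (locker first, address last, everything in between is the lock ID).

```shell
#!/bin/sh
# Sketch: read `rbd lock list` output on stdin and print (not run) the
# corresponding `rbd lock rm` commands.
# Assumption: locker names start with "client.", the address is the last
# field, and all fields in between form the lock ID.
print_lock_rm() {
    pool="$1"
    image="$2"
    awk -v pool="$pool" -v image="$image" '
        $1 ~ /^client\./ {
            id = $2
            for (i = 3; i < NF; i++) id = id " " $i
            printf "rbd -p %s lock rm %s \"%s\" %s\n", pool, image, id, $1
        }'
}

# Example (needs access to the cluster):
# rbd -p cinder lock list volume-730501ce-cf52-4888-a0de-597f587904b1 \
#     | print_lock_rm cinder volume-730501ce-cf52-4888-a0de-597f587904b1
```

Printing instead of executing is deliberate: removing a lock from an image that is still in use can corrupt data, so the commands should be checked before they are run.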
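As it turns out, since our Luminous upgrade the cinder user has been missing permissions. In Luminous, removing a lock also blacklists the previous lock holder, and the old monitor caps (allow r) do not permit that: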
Permission denied
:~$ rbd -p cinder --id cinder lock rm volume-730501ce-cf52-4888-a0de-597f587904b1 "auto 140589714861088" client.529785369
rbd: releasing lock failed: (13) Permission denied
old permissions
client.cinder
key: **********************==
caps: [mgr] allow r
caps: [mon] allow r
caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup
Fixing Permissions
To make openstack volume delete great again, simply update the ceph auth caps for your cinder user to use the rbd profiles:
/usr/bin/ceph auth caps client.cinder \
mon 'allow profile rbd' \
mgr 'allow profile rbd' \
osd 'allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup'
new permissions
client.cinder
key: **********************==
caps: [mgr] allow profile rbd
caps: [mon] allow profile rbd
caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup
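To confirm that the change took effect on every cluster you maintain, a caps listing can be checked mechanically. This is a minimal sketch: the helper name `mon_has_rbd_profile` is hypothetical, and it assumes the listing uses the `caps: [mon] ...` layout shown above (other ceph commands may print caps in a different format).

```shell
#!/bin/sh
# Sketch: read a caps listing on stdin and succeed (exit 0) only if the
# monitor cap mentions the rbd profile.
# Assumption: lines look like "caps: [mon] allow profile rbd".
mon_has_rbd_profile() {
    grep -F 'caps: [mon]' | grep -q 'profile rbd'
}

# Example with a listing saved to a file (file name is illustrative):
# if mon_has_rbd_profile < cinder-caps.txt; then
#     echo "mon cap uses the rbd profile"
# else
#     echo "mon cap still needs updating"
# fi
```

Once the check passes, retrying the lock removal and the volume delete should no longer fail with "Permission denied".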