Cannot delete OpenStack volumes after Ceph upgrade to Luminous

Recently we upgraded our Ceph cluster for OpenStack to Luminous. Everything seemed to be fine. But when deleting obsolete Cinder volumes, we noticed that their state changed from "active" to "deleting" for ~30 seconds, only to go back to "active" instead of being removed. The Cinder logs show the following lines:

WARNING cinder.volume.drivers.rbd [...] ImageBusy error raised while deleting rbd volume. This may have been caused by a connection from a client that has crashed and, if so, may be resolved by retrying the delete after 30 seconds has elapsed. ERROR cinder.volume.manager [...] Unable to delete busy volume.

"Usually" this happens when an RBD snapshot was taken bypassing Cinder, or when some instance is still watching this particular disk.

rbd snapshots:

:~# rbd -p cinder snap ls volume-730501ce-cf52-4888-a0de-597f587904b1
:~# rbd -p cinder-cache snap ls volume-730501ce-cf52-4888-a0de-597f587904b1
watchers:
:~# rbd -p cinder status volume-730501ce-cf52-4888-a0de-597f587904b1
Watchers: none
:~# rbd -p cinder-cache status volume-730501ce-cf52-4888-a0de-597f587904b1
Watchers: none

Neither scenario matched in our case. Instead, these volumes were still locked, even though the corresponding instance had been deleted a long time ago:

:~# rbd -p cinder lock list volume-730501ce-cf52-4888-a0de-597f587904b1
There is 1 exclusive lock on this image.
Locker           ID                   Address
client.529785369 auto 140589714861088 10.80.8.101:0/863421692
:~# rbd -p cinder-cache lock list volume-730501ce-cf52-4888-a0de-597f587904b1
There is 1 exclusive lock on this image.
Locker           ID                   Address
client.529785369 auto 140589714861088 10.80.8.101:0/863421692

As it turns out, since our Ceph upgrade to Luminous the cinder user has been missing permissions:

:~# rbd -p cinder --id cinder lock rm volume-730501ce-cf52-4888-a0de-597f587904b1 "auto 140589714861088" client.529785369
rbd: releasing lock failed: (13) Permission denied
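
As a one-off workaround, the stale lock can be broken with a user that has sufficient caps, e.g. the admin keyring (assuming it is available on the node); the locker ID and client name are the ones reported by the `lock list` output above:

```shell
# Break the stale exclusive lock as the admin user (full caps).
# Locker ID ("auto ...") and client name come from "rbd lock list".
rbd -p cinder --id admin lock rm volume-730501ce-cf52-4888-a0de-597f587904b1 \
    "auto 140589714861088" client.529785369
rbd -p cinder-cache --id admin lock rm volume-730501ce-cf52-4888-a0de-597f587904b1 \
    "auto 140589714861088" client.529785369
```

This only clears the symptom for one volume, though; the proper fix is to correct the cinder user's caps.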

The cinder user's current permissions:

client.cinder
    key: **********************==
    caps: [mgr] allow r
    caps: [mon] allow r
    caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup

To make `openstack volume delete` great again, simply update the Ceph auth caps for your cinder user to the RBD profiles. Since Luminous, `profile rbd` on the mon grants the blacklist permission a client needs to break the exclusive lock of a crashed client; a plain `allow r` mon cap does not:

/usr/bin/ceph auth caps client.cinder mon 'allow profile rbd' mgr 'allow profile rbd' osd 'allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup'

client.cinder
    key: **********************==
    caps: [mgr] allow profile rbd
    caps: [mon] allow profile rbd
    caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup
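
With the updated caps, the lock removal that failed above now succeeds as the cinder user, and the volume delete can be retried (same volume ID as in the examples above; the `openstack` command assumes a configured OpenStack CLI):

```shell
# Now permitted: break the stale lock as the cinder user itself.
rbd -p cinder --id cinder lock rm volume-730501ce-cf52-4888-a0de-597f587904b1 \
    "auto 140589714861088" client.529785369
# Retry the delete; the volume should no longer fall back to "active".
openstack volume delete 730501ce-cf52-4888-a0de-597f587904b1
```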

Last update: April 18, 2020