
Cannot delete OpenStack volumes after Ceph upgrade to Luminous

We recently upgraded our OpenStack Ceph cluster to Luminous.

Everything seemed to be fine... But when deleting obsolete Cinder volumes, we noticed that their status changed from available to deleting for about 30 seconds, only to go back to available instead of the volume being removed.

Analyzing the problem

The Cinder logs showed the following lines:

cinder.log

WARNING cinder.volume.drivers.rbd [...] ImageBusy error raised while deleting rbd volume. This may have been caused by a connection from a client that has crashed and, if so, may be resolved by retrying the delete after 30 seconds has elapsed.
ERROR cinder.volume.manager [...] Unable to delete busy volume.

"Usually" this happens when an rbd snapshot has been taken directly in Ceph, bypassing Cinder, or when some instance is still watching this particular disk.

list rbd snapshots

:~$ rbd -p cinder snap ls volume-730501ce-cf52-4888-a0de-597f587904b1
:~$ rbd -p cinder-cache snap ls volume-730501ce-cf52-4888-a0de-597f587904b1

list rbd watchers

:~$ rbd -p cinder status volume-730501ce-cf52-4888-a0de-597f587904b1
Watchers: none
:~$ rbd -p cinder-cache status volume-730501ce-cf52-4888-a0de-597f587904b1
Watchers: none
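
The two checks above can be wrapped into a small helper when several volumes or pools need to be inspected. This is only a sketch: `check_rbd_image` is a hypothetical function name, and it assumes the `rbd` CLI is available and configured the same way as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: report snapshots and watchers for one image across several pools.
# check_rbd_image is a hypothetical helper name, not part of Ceph or Cinder.
check_rbd_image() {
    local image="$1"; shift
    local pool
    for pool in "$@"; do
        echo "== ${pool}/${image} =="
        # Snapshots taken outside of Cinder would be listed here.
        rbd -p "$pool" snap ls "$image"
        # Clients that still have the image open are listed as watchers.
        rbd -p "$pool" status "$image"
    done
}
```

With the pools from this post, it could be called as `check_rbd_image volume-730501ce-cf52-4888-a0de-597f587904b1 cinder cinder-cache`.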

Permission issues

Neither scenario applied in our case. Instead, these volumes were still locked, even though the corresponding instances had been deleted a "long time ago":

get rbd locks

:~$ rbd -p cinder lock list volume-730501ce-cf52-4888-a0de-597f587904b1
There is 1 exclusive lock on this image.
Locker           ID                   Address
client.529785369 auto 140589714861088 10.80.8.101:0/863421692
:~$ rbd -p cinder-cache lock list volume-730501ce-cf52-4888-a0de-597f587904b1
There is 1 exclusive lock on this image.
Locker           ID                   Address
client.529785369 auto 140589714861088 10.80.8.101:0/863421692
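
Because the lock ID itself contains a space ("auto 140589714861088"), it is easy to pass it on incorrectly to a later `rbd lock rm` call. The following sketch splits the `lock list` output into locker, lock ID and address. `parse_rbd_locks` is a hypothetical helper name, and it assumes the plain-text layout shown above, where the locker is the first column and the address the last.

```shell
#!/usr/bin/env bash
# Sketch: turn "rbd lock list" text output into tab-separated fields:
#   locker <TAB> lock id <TAB> address
# parse_rbd_locks is a hypothetical helper; it assumes the column layout
# shown in this post (locker first, address last, lock id in between).
parse_rbd_locks() {
    awk '$1 ~ /^client\./ {
        id = $2
        # The lock id may consist of several words, e.g. "auto 1405...".
        for (i = 3; i < NF; i++) id = id " " $i
        printf "%s\t%s\t%s\n", $1, id, $NF
    }'
}
```

For example, `rbd -p cinder lock list volume-730501ce-cf52-4888-a0de-597f587904b1 | parse_rbd_locks` would print one line per lock, ready to be read with `IFS=$'\t' read -r locker id address`.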

As it turns out, since our Luminous upgrade the cinder user has been missing the permissions needed to remove such a lock:

Permission denied

:~$ rbd -p cinder --id cinder lock rm volume-730501ce-cf52-4888-a0de-597f587904b1 "auto 140589714861088" client.529785369
rbd: releasing lock failed: (13) Permission denied

old permissions

client.cinder
    key: **********************==
    caps: [mgr] allow r
    caps: [mon] allow r
    caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup

Fixing Permissions

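Since Luminous, a client has to be allowed to blacklist a dead lock holder before it can break that lock. The rbd profile on the mon includes this permission, while a plain 'allow r' does not. To make OpenStack volume deletion great again, simply update the Ceph auth caps of your cinder user to use the rbd profile for mon and mgr: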

/usr/bin/ceph auth caps client.cinder \
  mon 'allow profile rbd' \
  mgr 'allow profile rbd' \
  osd 'allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup'

new permissions

client.cinder
    key: **********************==
    caps: [mgr] allow profile rbd
    caps: [mon] allow profile rbd
    caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=cinder, allow rwx pool=glance, allow rx pool=cinder-backup
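
To confirm that the change took effect, the caps listing can be checked for the rbd profile on the mon line. This is a minimal sketch: `has_rbd_mon_profile` is a hypothetical helper name, and it only looks for the text "profile rbd" on a mon caps line in whatever listing is piped into it.

```shell
#!/usr/bin/env bash
# Sketch: succeed if the mon caps line read from stdin mentions "profile rbd".
# has_rbd_mon_profile is a hypothetical helper. It accepts both the
# "caps: [mon] ..." layout shown in this post and the 'caps mon = "..."'
# layout printed by "ceph auth get".
has_rbd_mon_profile() {
    grep -E 'caps:? *\[?mon\]?' | grep -q 'profile rbd'
}
```

A possible call would be `ceph auth get client.cinder | has_rbd_mon_profile && echo "mon caps ok"`.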

Last update: March 22, 2021
