ceph - ceph-devel - 2024-06-27

Timestamp (UTC) | Message
2024-06-27T06:44:45.208Z
<Vikram Kumar Vaidyam> I am encountering an issue with space reclamation when using Rook-Ceph for dynamic storage provisioning in a Kubernetes cluster. Here are the details:
**Environment Details:**
• **Storage Classes Used**:
    ◦ `standard-rwo` (provisioner: `rook-ceph.rbd.csi.ceph.com`)
    ◦ `rook-cephfs` (provisioner: `rook-ceph.cephfs.csi.ceph.com`)
**Issue Description:**
We use the `standard-rwo` storage class with the `rook-ceph.rbd.csi.ceph.com` provisioner. The class allows volume expansion, and increasing the Persistent Volume (PV) size works correctly. However, when we attempt to shrink or delete a PV, the space is not reclaimed in Ceph.
**Steps Taken:**
1. Increased PV size to ~1TB successfully.
2. Deleted the PV to reclaim the space.
3. Observed that the space was not reclaimed in Ceph.
**Observations:**
**Ceph Status Commands**:
• $ ceph osd pool stats
• $ ceph df
• Both commands indicate that the space used by the deleted PV is not being reclaimed.

I also tried to check for orphaned objects, but I get a permission error while doing so;
`rgw-orphan-list` gives me this error:
```Available pools:
/usr/bin/rgw-orphan-list: line 39: ./lspools-20240625053651.error: Permission denied
Pool is "An error was encountered while running 'rados lspools'. Aborting.
Review file './lspools-20240625053651.error' for details.
***
*** WARNING: The results are incomplete. Do not use! ***
***".
Note: output files produced will be tagged with the current timestamp -- 20240625053651.
running 'rados ls' at Tue Jun 25 05:36:51 UTC 2024
running 'rados ls' on pool An.
/usr/bin/rgw-orphan-list: line 66: ./rados-20240625053651.intermediate: Permission denied
An error was encountered while running 'rados ls'. Aborting.
Review file './rados-20240625053651.error' for details.
***
*** WARNING: The results are incomplete. Do not use! ***
***```
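For reference, a minimal set of checks to see whether the deleted PV's backing image is still listed in the pool or sitting in the RBD trash, and to rerun the orphan scan from a writable directory (the "Permission denied" above is the script failing to write its work files to the current directory). The pool name `replicapool` is an assumption; substitute the pool that actually backs `standard-rwo`:
```
# Typically run from the rook-ceph toolbox pod.
# "replicapool" is an assumed pool name -- substitute the pool behind standard-rwo.

# List the RBD images still present in the pool
rbd ls --pool replicapool

# Deleted images may be parked in the RBD trash before their space is freed
rbd trash ls --pool replicapool

# Per-pool usage breakdown
ceph df detail

# rgw-orphan-list writes its work files to the current directory, which is
# where the "Permission denied" comes from -- rerun it from a writable path
cd /tmp && rgw-orphan-list
```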
2024-06-27T11:33:41.843Z
<Mehmet> Hello ceph guys 🙂,

1. I have a 3-node cluster to which I have added a 4th node without any OSDs.
2. I want to remove a few OSDs from the existing 3 nodes and move them to the 4th node.
I would set these disks "out", then physically move them to the 4th node and enable/activate those OSDs there.

Is there a better way to deal with this?
2024-06-27T12:15:34.800Z
<Mehmet> I would start now with the following (the disks are 3.5 TB):
```Node 1
ceph osd crush reweight osd.26 3.0
ceph osd crush reweight osd.24 3.0

Node 2
ceph osd crush reweight osd.27 3.0
ceph osd crush reweight osd.25 3.0

Node 3
ceph osd crush reweight osd.31 3.0
ceph osd crush reweight osd.28 3.0```
and then continue stepping down to 0 while keeping an eye on the usage (currently at ~88%).

When these disks are at 0, I would take them out, move them to the new node, and start them there.

Or is there a better way? 🤔
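For what it's worth, a rough sketch of how that gradual reweighting could be scripted; the OSD id, the weight steps, and the `ceph health` poll are placeholders rather than a tested procedure, and at ~88% usage a nearfull warning may keep the health check from ever reporting OK:
```
# Rough sketch of stepping one OSD's CRUSH weight down gradually.
# osd.26, the weight steps, and the 60s poll interval are illustrative only.
for w in 3.0 2.0 1.0 0; do
    ceph osd crush reweight osd.26 $w
    # crude wait: block until the cluster reports HEALTH_OK again
    until ceph health | grep -q HEALTH_OK; do sleep 60; done
done

# Confirm the drained OSD can be stopped without risking data availability
ceph osd safe-to-destroy osd.26

# After physically moving the disk, activate the OSD on the new node
ceph-volume lvm activate --all
```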
2024-06-27T12:16:01.420Z
<Mehmet> oops... sry, this is the devel list 😕 will post it in the common chat
2024-06-27T12:39:12.358Z
<Pedro Gonzalez Gomez> Hey @Jason Burris, did you get to fix it? If not, which Ceph version are you running?
2024-06-27T13:59:32.401Z
<Joseph Mundackal> <https://github.com/ceph/ceph/pull/57271> - are we good to merge this? rados QA approved + a few code review approvals as well
2024-06-27T15:36:55.486Z
<Casey Bodley> fyi, i've just merged <https://github.com/ceph/ceph/pull/57581> to bump our boost version to 1.85. if this causes any issues, please let me or @aemerson know
2024-06-27T15:40:14.828Z
<nehaojha> Yes, @yuriw will merge it later today 
2024-06-27T15:42:24.389Z
<yuriw> @Aishwarya Mathuria ^
2024-06-27T15:54:14.582Z
<yuriw> @Joseph Mundackal merged
2024-06-27T16:26:20.086Z
<Jason Burris> I did get around it eventually after re-running ninja without cleaning up the build path first.  Still not sure what the cause was.
2024-06-27T16:48:02.977Z
<Joseph Mundackal> thanks @yuriw!
2024-06-27T17:00:41.366Z
<Joseph Mundackal> thanks @yuriw!
@Aishwarya Mathuria - thanks for the QA on it!
2024-06-27T23:39:43.594Z
<yuriw> <https://github.com/ceph/ceph/pull/57589> I need to cherry-pick this PR for squid, but it can't pass `make check`. I was told it's a known issue; any clues what's going on and an ETA?

Any issue? Please create an issue here and use the infra label.