ceph - cephfs - 2024-07-18

Timestamp (UTC) | Message
2024-07-18T00:59:32.460Z
<Xiubo Li> Hi Hutch, Please see <https://tracker.ceph.com/issues/50719?#note-38>
2024-07-18T01:00:29.084Z
<Xiubo Li> If you could reproduce it then please enable both the mds and kernel debug logs and then try to reproduce it.
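(For reference, a minimal sketch of how MDS and kernel-client debug logging are commonly raised for this kind of reproduction; the exact subsystems and levels shown are an assumption, not something Xiubo specified here.)
```
# Raise MDS debug logging (assumed levels; adjust to whatever the developer asks for)
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1

# Enable kernel CephFS client debug output via dynamic debug (kernel mount only)
echo "module ceph +p" | sudo tee /sys/kernel/debug/dynamic_debug/control

# Reproduce the problem, then collect the MDS log and the kernel log (dmesg)
```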
2024-07-18T04:49:07.720Z
<jcollin> @Dhairya Parmar Why is there a DNM on <https://github.com/ceph/ceph/pull/56887>? If you could resolve it, I could QA it today.
2024-07-18T05:57:43.832Z
<Venky Shankar> Hey devs, RFR - <https://github.com/ceph/ceph/pull/58645>
2024-07-18T07:37:48.560Z
<jcollin> This PR skipped today's QA run.
2024-07-18T08:55:16.900Z
<Dhairya Parmar> hmm patrick had added it and i have no idea why
2024-07-18T08:55:48.302Z
<Dhairya Parmar> oh okay
2024-07-18T08:55:56.852Z
<Dhairya Parmar> DNM can be removed now
2024-07-18T09:23:03.687Z
<jcollin> Okay, please do the needful. It will get picked up in the next QA batch.
2024-07-18T11:32:03.818Z
<jcollin> @Venky Shankar Do we need to cherry-pick all commits in <https://github.com/ceph/ceph/pull/58632/commits> to downstream RHCS7.1z2? Or just the Negative Seconds commit is enough?
2024-07-18T11:32:46.114Z
<Venky Shankar> you can leave out the qa change
2024-07-18T11:32:58.871Z
<Venky Shankar> but generally we backport everything
2024-07-18T11:33:06.749Z
<jcollin> ok
2024-07-18T13:21:59.026Z
<Rishabh Dave> @Venky Shankar I've added the details of the issue we discussed in the standup, PTAL - <https://github.com/ceph/ceph/pull/54620#issuecomment-2236487221>.
2024-07-18T13:30:28.895Z
<Rishabh Dave> (edited the comment a bit just now)
2024-07-18T16:46:31.388Z
<Hutch> Yes, the main thing causing this issue is specifically the use of SMB on top of the Ceph filesystem. If you are not using SMB, the issue will not happen.
2024-07-18T16:47:01.548Z
<Hutch> I have a testing environment set up here at 45drives. If you would like to take this offline and have remote access, let me know.
2024-07-18T17:52:56.289Z
<Hutch> I also just updated the bug report. Please let me know if you need anything.
2024-07-18T18:08:40.001Z
<Erich Weiler> Hi @Xiubo Li - I was looking at <https://github.com/ceph/ceph/pull/58474> a bit.  Is the idea that if `mds_client_delegate_inos_pct` is set to `0` then this issue shouldn’t even happen (did I read that right)?  I only ask because I already have `mds_client_delegate_inos_pct` set to `0` and I still see the lock issue happening.  Just wanted to double check before we got too far in the process!
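(For context, a hedged sketch of how one might confirm which value is actually in effect on a running MDS; the daemon name `mds.a` is a placeholder.)
```
# What the cluster config database holds for MDS daemons
ceph config get mds mds_client_delegate_inos_pct

# What a specific running MDS daemon currently has in effect (mds.a is a placeholder)
ceph daemon mds.a config get mds_client_delegate_inos_pct
```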
2024-07-18T18:48:31.492Z
<Dhairya Parmar> Hey Wes, how about listing all objects of a pool using `rados -p <pool> ls`? This will tell you which objects are the bulkiest (and maybe dump the omap values for each of those objects using `rados -p <pool> listomapvals <obj>` for more detail). Once you have the object you can then make use of `ceph-objectstore-tool` to inspect its contents; this is in the docs and goes like this:
```
ceph-objectstore-tool --data-path PATH_TO_OSD --pgid PG_ID OBJECT list-omap
```
<https://docs.ceph.com/en/pacific/man/8/ceph-objectstore-tool/#listing-the-object-map>
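(Putting the steps Dhairya describes together, a rough sketch; the pool, OSD data path, PG ID, and object name are placeholders, and `ceph-objectstore-tool` typically needs the OSD stopped so it has exclusive access to the data path.)
```
# 1. List objects in the pool and look for the suspiciously large ones
rados -p <pool> ls

# 2. Dump the omap key/value pairs of a candidate object
rados -p <pool> listomapvals <object>

# 3. Inspect the object's omap directly on the OSD that stores it
#    (run against a stopped OSD; all values below are placeholders)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --pgid <pg-id> <object> list-omap
```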
2024-07-18T18:48:50.187Z
<Dhairya Parmar> or maybe give this a shot <https://ceph.io/en/news/blog/2015/get-omap-keyvalue-size/>