2024-07-18T00:59:32.460Z | <Xiubo Li> Hi Hutch, please see <https://tracker.ceph.com/issues/50719?#note-38> |
2024-07-18T01:00:29.084Z | <Xiubo Li> If you can reproduce it, please enable both the mds and kernel debug logs and then try to reproduce it again. |
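A minimal sketch of one way to enable those logs, assuming a recent Ceph release and a kernel built with dynamic debug support; the verbosity levels shown are illustrative, not values Xiubo specified:

```
# On the cluster: raise MDS debug verbosity in the config database
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1

# On the client node: turn on ceph/libceph kernel module debug messages
# (requires debugfs mounted at /sys/kernel/debug and CONFIG_DYNAMIC_DEBUG)
echo "module ceph +p"    > /sys/kernel/debug/dynamic_debug/control
echo "module libceph +p" > /sys/kernel/debug/dynamic_debug/control

# Reproduce the issue, then collect the MDS log and the client's dmesg output
```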
2024-07-18T04:49:07.720Z | <jcollin> @Dhairya Parmar Why is there a DNM on <https://github.com/ceph/ceph/pull/56887>? If you could resolve it, I could QA it today. |
2024-07-18T05:57:43.832Z | <Venky Shankar> Hey devs, RFR - <https://github.com/ceph/ceph/pull/58645> |
2024-07-18T07:37:48.560Z | <jcollin> This PR skipped today's QA run. |
2024-07-18T08:55:16.900Z | <Dhairya Parmar> hmm, Patrick had added it and I have no idea why |
2024-07-18T08:55:48.302Z | <Dhairya Parmar> oh okay |
2024-07-18T08:55:56.852Z | <Dhairya Parmar> DNM can be removed now |
2024-07-18T09:23:03.687Z | <jcollin> Okay, please do the needful. It will get picked up in the next QA batch. |
2024-07-18T11:32:03.818Z | <jcollin> @Venky Shankar Do we need to cherry-pick all commits in <https://github.com/ceph/ceph/pull/58632/commits> to downstream RHCS 7.1z2? Or is the Negative Seconds commit alone enough? |
2024-07-18T11:32:46.114Z | <Venky Shankar> you can leave out the qa change |
2024-07-18T11:32:58.871Z | <Venky Shankar> but generally we backport everything |
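A rough sketch of the backport flow being discussed; the remote and branch names are placeholders, not the actual downstream layout:

```
# Placeholder remote/branch names for the downstream tree
git fetch downstream
git checkout -b wip-backport-58632 downstream/rhcs-7.1z2

# Pick the fix commits from https://github.com/ceph/ceph/pull/58632,
# leaving out the qa change as suggested; -x records the source SHA
git cherry-pick -x <fix-commit-sha>
```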
2024-07-18T11:33:06.749Z | <jcollin> ok |
2024-07-18T13:21:59.026Z | <Rishabh Dave> @Venky Shankar I've added the details of the issue we discussed in the standup, PTAL - <https://github.com/ceph/ceph/pull/54620#issuecomment-2236487221>. |
2024-07-18T13:30:28.895Z | <Rishabh Dave> (edited the comment a bit just now) |
2024-07-18T16:46:31.388Z | <Hutch> yes, the main thing causing this issue is specifically using SMB on top of the Ceph filesystem. If you are not using SMB, the issue will not happen. |
2024-07-18T16:47:01.548Z | <Hutch> I have a testing environment set up here at 45Drives; if you would like to take this offline and have remote access, let me know. |
2024-07-18T17:52:56.289Z | <Hutch> I also just updated the bug report. Please let me know if you need anything. |
2024-07-18T18:08:40.001Z | <Erich Weiler> Hi @Xiubo Li - I was looking at <https://github.com/ceph/ceph/pull/58474> a bit. Is the idea that if `mds_client_delegate_inos_pct` is set to `0` then this issue shouldn’t even happen (did I read that right)? I only ask because I already have `mds_client_delegate_inos_pct` set to `0` and I still see the lock issue happening. Just wanted to double check before we got too far in the process! |
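One way to double-check what the daemons are actually running with (`mds.a` is a placeholder daemon name):

```
# Value recorded in the cluster configuration database
ceph config get mds mds_client_delegate_inos_pct

# Value the running daemon is actually using (run on the MDS host)
ceph daemon mds.a config get mds_client_delegate_inos_pct
```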
2024-07-18T18:48:31.492Z | <Dhairya Parmar> Hey Wes, how about listing all objects of the pool using `rados -p <pool> ls`? That will show you every object, and you can then dump the omap values of each one using `rados -p <pool> listomapvals <obj>` to see which objects are the bulkiest. Once you have an object of interest, you can use `ceph-objectstore-tool` to inspect its object map; per the docs it goes like this:
```ceph-objectstore-tool --data-path PATH_TO_OSD --pgid PG_ID OBJECT list-omap```
<https://docs.ceph.com/en/pacific/man/8/ceph-objectstore-tool/#listing-the-object-map> |
2024-07-18T18:48:50.187Z | <Dhairya Parmar> or maybe give this a shot <https://ceph.io/en/news/blog/2015/get-omap-keyvalue-size/> |
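Tying the two suggestions together, a rough sketch that ranks a pool's objects by omap key count; `cephfs_metadata` is a placeholder pool name, and key count is only a proxy for omap size:

```
#!/bin/bash
pool=cephfs_metadata   # placeholder; substitute the pool under investigation

# One rados call per object, so this can be slow on large pools
rados -p "$pool" ls | while read -r obj; do
    keys=$(rados -p "$pool" listomapkeys "$obj" | wc -l)
    echo "$keys $obj"
done | sort -rn | head -20
```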