ceph - cephfs - 2024-06-14

Timestamp (UTC) | Message
2024-06-14T09:54:09.142Z
<Jos Collin> @Venky Shankar Could you please approve these: <https://github.com/ceph/ceph/pull/57762>, <https://github.com/ceph/ceph/pull/57760>. No errors found in squid run: <https://tracker.ceph.com/issues/66423>
2024-06-14T10:01:29.087Z
<Rishabh Dave> @Venky Shankar <https://github.com/ceph/ceph/pull/54620#pullrequestreview-2117728922>  I agree with the problem you described in this review comment, but the solution you proposed is unclear to me. The "simplistic fix" you've described has already been implemented.
2024-06-14T10:02:44.245Z
<Rishabh Dave> Copying the "sophisticated fix" part here -

>  For a more sophisticated fix, `_get_info_for_all_clones` could include the clone status and then here we select clones (and aggregate sizes) based on the status.
Clone status? How would we get the current number of cloner threads allowed through the clone status?
2024-06-14T10:03:17.146Z
<Venky Shankar> no need to get the number of threads
2024-06-14T10:03:25.442Z
<Venky Shankar> with each clone entry get its clone status
2024-06-14T10:03:38.095Z
<Venky Shankar> it would be either in-progress or pending or canceled
2024-06-14T10:03:48.413Z
<Venky Shankar> then for whatever is in-progress, aggregate the size
2024-06-14T10:04:05.240Z
<Rishabh Dave> ah, okay. count it right then and there.
2024-06-14T10:04:17.202Z
<Venky Shankar> then -- for the rest aggregate the size + (aggregated size for in-progress)
2024-06-14T10:04:23.712Z
<Venky Shankar> yeh
2024-06-14T10:04:35.917Z
<Venky Shankar> no need to rely on max_concurrent_clones
2024-06-14T10:04:57.471Z
<Rishabh Dave> got git
2024-06-14T10:04:59.559Z
<Rishabh Dave> got it*
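For context, here is a minimal Python sketch of the status-based aggregation described above. It is an illustration, not the code from <https://github.com/ceph/ceph/pull/54620>; the `status` and `size` fields on each clone entry are assumed for the example, and the real shape of what `_get_info_for_all_clones` returns may differ:

```python
# Illustrative sketch only -- not the actual mgr/volumes code.
# Assumes each clone entry carries a 'status' of 'in-progress', 'pending'
# or 'canceled', and a 'size' in bytes; real field names may differ.

def aggregate_clone_sizes(clone_entries):
    """Select clones by status and aggregate their sizes, without
    consulting max_concurrent_clones."""
    in_progress_total = 0
    pending_total = 0
    for entry in clone_entries:
        if entry['status'] == 'in-progress':
            in_progress_total += entry['size']
        elif entry['status'] == 'pending':
            pending_total += entry['size']
        # canceled clones are skipped
    # "for the rest aggregate the size + (aggregated size for in-progress)"
    overall_total = pending_total + in_progress_total
    return in_progress_total, overall_total
```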
2024-06-14T10:09:15.827Z
<Rishabh Dave> we'll be adding the code to make the progress reporter thread wait for new clone jobs; see <https://github.com/ceph/ceph/pull/54620>.

in the same way, can we implement this in a separate PR?
2024-06-14T10:09:40.474Z
<Venky Shankar> yeh, I'm fine with that
2024-06-14T10:09:46.916Z
<Venky Shankar> let's do the simplistic fix for now
2024-06-14T10:09:51.458Z
<Rishabh Dave> cool.
2024-06-14T10:12:32.916Z
<Rishabh Dave> the current code we have on the PR for the "simplistic fix", is it okay in your opinion? or does it need some improvements?
2024-06-14T10:14:24.944Z
<Venky Shankar> i'll review once more
2024-06-14T10:14:36.272Z
<Venky Shankar> but it's almost there 🙂
2024-06-14T10:14:42.995Z
<Rishabh Dave> okay
2024-06-14T10:15:11.112Z
<Rishabh Dave> i am making the rest of the requested changes and testing with vstart_runner in the meantime
2024-06-14T11:09:28.070Z
<Jos Collin> @Venky Shankar Does this need rados suite run also? <https://github.com/ceph/ceph/pull/57840>
2024-06-14T11:10:01.435Z
<Venky Shankar> No
2024-06-14T11:12:12.486Z
<Venky Shankar> will do
2024-06-14T11:16:10.823Z
<Jos Collin> okay
2024-06-14T11:17:44.665Z
<Jos Collin> @Rishabh Dave You have `MDS_CACHE_OVERSIZED` in the ignorelist, but this test still fails? <https://pulpito.ceph.com/leonidus-2024-06-12_09:41:32-fs-wip-lusov-testing-20240611.123850-squid-distro-default-smithi/7751944/>
2024-06-14T12:10:29.904Z
<Rishabh Dave> tests passed locally, i've pushed to this PR. PTAL.
2024-06-14T12:10:53.853Z
<Rishabh Dave> i'll go through old review comments and check if anything is left.
2024-06-14T12:15:51.240Z
<Rishabh Dave> added where?
2024-06-14T12:17:53.852Z
<Rishabh Dave> are you talking about this?
2024-06-14T12:17:55.632Z
<Rishabh Dave> <https://github.com/ceph/ceph/pull/57840/files#diff-0ec2f98005b3a59e058ad5748d70a283d4cd8258de53e445e6815b4eca599339R8>
2024-06-14T12:18:59.700Z
<Rishabh Dave> it's added to the ignorelist for `fs/functional/tasks/admin.yaml`
2024-06-14T12:20:11.587Z
<Rishabh Dave> which is unrelated to the failing job; that job is `fs:workload`, not `fs:functional`.
2024-06-14T22:38:26.738Z
<Bailey Allison> hey everyone, currently waiting for account approval on the bug tracker, but I'm running into this: <https://tracker.ceph.com/issues/64852>. just looking at the nodes to confirm the issue is on the kernel side? would switching from kernel to fuse provide any benefit for this? I am seeing the same logs as this too
