ceph - cephfs - 2024-06-17

Timestamp (UTC)Message
2024-06-17T04:21:26.656Z
<Venky Shankar> Hey devs - RFR <https://github.com/ceph/ceph/pull/57991>
2024-06-17T04:23:22.103Z
<Venky Shankar> That's right, Bailey. Switching to ceph-fuse would result in _some_ perf drop.
2024-06-17T04:23:46.702Z
<Venky Shankar> The kernel patches could take sometime to land in your specific distro kernel.
2024-06-17T05:00:29.794Z
<Jos Collin> @Venky Shankar Could you please ack? <https://github.com/ceph/ceph/pull/57678#pullrequestreview-2118198504>
2024-06-17T06:40:07.439Z
<Venky Shankar> I'll have to check the failed run Jos.
2024-06-17T09:10:41.019Z
<Jos Collin> ok, I'll check
2024-06-17T09:13:49.313Z
<Jos Collin> @Leonid Usov @Venky Shankar Could you please approve these PRs. <https://github.com/ceph/ceph/pull/57763>, <https://github.com/ceph/ceph/pull/57441>, <https://github.com/ceph/ceph/pull/57437>. No Related failures in <https://pulpito.ceph.com/?branch=wip-lusov-testing-20240613.155007-reef>.
2024-06-17T10:08:24.804Z
<Dhairya Parmar> @Venky Shankar I/O calls can only make up < 4GiB of write on each call and the problem lies in `buffer.h`  and not the libcephfs
2024-06-17T10:08:43.060Z
<Dhairya Parmar> should this be enhanced or the limit is fine?
2024-06-17T10:09:29.684Z
<Dhairya Parmar> should this be enhanced or the add a check to limit before it leads to overflow?
2024-06-17T10:09:37.331Z
<Dhairya Parmar> should this be enhanced or add a check to limit before it leads to overflow?
2024-06-17T11:47:04.396Z
<Xiubo Li> `src/include/buffer.h` ?
2024-06-17T11:47:21.542Z
<Xiubo Li> Is this one ?
2024-06-17T11:49:58.976Z
<Rishabh Dave> @Venky Shankar In context to this comment <https://github.com/ceph/ceph/pull/54620#discussion_r1637695939>, `dst_path` should be path to subvolume UUID, so that `rsize` doesn't include snapshots of the subvolume?
2024-06-17T11:51:16.742Z
<Rishabh Dave> Right now it is path till directory for the subvol. Example: `/volumes/_nogroup/ss1clone1`
2024-06-17T12:11:47.351Z
<Bailey Allison> hey venky, thanks for getting back to me, I was more so curious if fuse would be a way to get around this issue, however for this we did switch to fuse and were still seeing the same issue
2024-06-17T12:18:27.772Z
<Dhairya Parmar> yup @Xiubo Li
2024-06-17T12:19:47.971Z
<Dhairya Parmar> if you try to write(sync or async) `4GiB`  i.e. `4294967296` bytes then the int overflow wraps the value to `0`  and then there will be undefined behaviour
2024-06-17T12:20:23.416Z
<Dhairya Parmar> i.e. the bufferlist will contain pointer to the iovec structs but the overall length of the list is `0`
2024-06-17T12:20:27.848Z
<Dhairya Parmar> this crashes the call
2024-06-17T12:20:36.652Z
<Xiubo Li> IMO this should be explicitly explained why in `buffer.h` and just fails it directly when calling the API with overflowed size.
2024-06-17T12:20:46.638Z
<Dhairya Parmar> 4294967296
2024-06-17T12:20:55.581Z
<Dhairya Parmar> [4294967296](https://tracker.ceph.com/issues/66245#note-4)
2024-06-17T12:20:56.290Z
<Dhairya Parmar> [4294967296](https://tracker.ceph.com/issues/66245#note-4)
2024-06-17T12:21:01.757Z
<Dhairya Parmar> <https://tracker.ceph.com/issues/66245#note-4>
2024-06-17T12:21:07.398Z
<Dhairya Parmar> RCAed here ^^
2024-06-17T12:23:11.273Z
<Xiubo Li> As my understanding, it should be fixed `buffer.h` API first when it will overflow it should fail or at least print some warnings ?
2024-06-17T12:25:08.877Z
<Dhairya Parmar> IMO this should be fixed. Personally i don't think warnings could help, it could lead to potential data corruptions. A better way to counter this would be return EINVAL
2024-06-17T12:25:37.322Z
<Dhairya Parmar> but a long term fix is obviously de-limiting `buffer.h`
2024-06-17T12:26:17.821Z
<Xiubo Li> maybe just assert if overflowed
2024-06-17T12:26:50.914Z
<Xiubo Li> it's better return with failure to caller
2024-06-17T12:27:02.339Z
<Xiubo Li> and let caller know what has happened
2024-06-17T12:28:11.305Z
<Dhairya Parmar> im concerned about cases where it is necessary to write huge buffer(>4GiB)
2024-06-17T12:28:20.912Z
<Xiubo Li> It's strange that overflowed and doesn't give any hint. It's a bug needed to be fix anyway IMO.
2024-06-17T12:29:04.330Z
<Dhairya Parmar> it wont because the return type here is unsigned int, and unlike signed int, it doesn't throw undefined behavior, it just wraps the value up with modulo
2024-06-17T12:29:28.322Z
<Dhairya Parmar> (return type of bufferlist's funcs)
2024-06-17T13:42:52.636Z
<Rishabh Dave> @Venky Shankar ping
2024-06-17T13:58:20.013Z
<Venky Shankar> I think yes
2024-06-17T13:58:28.740Z
<Venky Shankar> else we might be overaccounting
2024-06-17T13:58:31.377Z
<Venky Shankar> cc @Kotresh H R
2024-06-17T14:57:06.535Z
<Jos Collin> @Venky Shankar @Leonid Usov ping
2024-06-17T15:14:17.528Z
<Jos Collin> Thanks @Venky Shankar. Merged.
2024-06-17T22:59:03.492Z
<alvaro.soto> has renamed the channel from "cephfs" to "ceph-cephfs"
2024-06-17T23:03:46.676Z
<khyr0n> (sorry about the noise, doing some reconfig)

Any issue? please create an issue here and use the infra label.