ceph - ceph-devel - 2024-08-22

Timestamp (UTC) | Message
2024-08-22T09:05:40.648Z
<baum> Hello, everyone 🖖 posted a build, shaman looks green: <https://shaman.ceph.com/builds/ceph/wip-baum-main-202258fa-20240822-nvmeof-enabled/54f078595c3da6930404c7a4c94641dc1f26e798/>, however pulling the corresponding image from quay results in `HTTP 502`:
```
baum@ceph-nvme-build-server-2 ~ $ docker pull quay.ceph.io/ceph-ci/ceph:54f078595c3da6930404c7a4c94641dc1f26e798
54f078595c3da6930404c7a4c94641dc1f26e798: Pulling from ceph-ci/ceph
e8b54c863393: Retrying in 1 second
23b73fa7bf3d: Retrying in 1 second
error pulling image configuration: download failed after attempts=6: received unexpected HTTP status: 502 Bad Gateway
```
This response is different from pulling an incorrect tag, which returns `manifest unknown`:
```
baum@ceph-nvme-build-server-2 ~ $ docker pull quay.ceph.io/ceph-ci/ceph:wrong_tag
Error response from daemon: manifest for quay.ceph.io/ceph-ci/ceph:wrong_tag not found: manifest unknown: manifest unknown
```
any idea? thank you 🙏
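For reference, one way to tell whether a tag is actually missing (as opposed to the registry's storage backend failing) is to query the manifest endpoint of the Docker Registry v2 API directly. A minimal sketch, assuming anonymous access to quay.ceph.io; if the registry demands a token this returns 401 instead, which is still clearly distinguishable from a 502:
```
# 200 -> manifest exists, so the 502 is coming from the blob/config download;
# 404 -> the tag really is unknown to the registry.
TAG=54f078595c3da6930404c7a4c94641dc1f26e798
curl -sS -o /dev/null -w '%{http_code}\n' \
  -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
  "https://quay.ceph.io/v2/ceph-ci/ceph/manifests/${TAG}"
```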
2024-08-22T12:11:56.308Z
<baum> Just noticed another pair: a green shaman build, <https://shaman.ceph.com/builds/ceph/wip-sjust-nvmeof-testing-2024-08-21/884868d026099a5a4a0a07bd5b5cb2396f64ec96/>, whose image pull fails the same way:
```
$ docker pull quay.ceph.io/ceph-ci/ceph:884868d026099a5a4a0a07bd5b5cb2396f64ec96
884868d026099a5a4a0a07bd5b5cb2396f64ec96: Pulling from ceph-ci/ceph
e8b54c863393: Retrying in 1 second
b2d32b67ac1b: Retrying in 1 second
error pulling image configuration: download failed after attempts=6: received unexpected HTTP status: 502 Bad Gateway
```
2024-08-22T13:23:14.195Z
<Casey Bodley> anyone have experience with mimalloc? sounds promising: <https://github.com/microsoft/mimalloc?tab=readme-ov-file#performance>
2024-08-22T13:23:19.982Z
<Casey Bodley> > In our benchmark suite, _mimalloc_ outperforms other leading allocators (_jemalloc_, _tcmalloc_, _Hoard_, etc), and has a similar memory footprint.
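For a quick, throwaway experiment, mimalloc's README describes dynamically overriding the system allocator with `LD_PRELOAD`; a rough sketch (the library path is an assumption and depends on how mimalloc was installed):
```
# Preload mimalloc for a single run; MIMALLOC_SHOW_STATS=1 makes it print
# allocator statistics on exit, confirming the override actually took effect.
MIMALLOC_SHOW_STATS=1 \
LD_PRELOAD=/usr/local/lib/libmimalloc.so \
  ./bin/ceph-osd --version
```
A real comparison against tcmalloc would of course need proper build-system integration rather than a preload hack.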
2024-08-22T13:43:32.883Z
<John Mulligan> is anyone looking into frequent `api` (as in `jenkins test api`) failures? I've tried rebasing my PR a few times hoping the problem would just go away, but so far no such luck.  Are the failures intermittent and I'm just getting "bad RNG"?
2024-08-22T13:45:25.367Z
<Æmerson> Which failure are you getting? I've been noticing compilation failures in `condition_variable::wait` in `fair_mutex`.
2024-08-22T13:45:32.356Z
<Æmerson> (I blame Clang 14.)
2024-08-22T13:46:55.243Z
<John Mulligan> <https://jenkins.ceph.com/job/ceph-api/80225/console> is an example. I don't know how to read this test output, FWIW; I don't see any actual test failures, but I could be missing something.
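One way to dig the actual failure out of a long run like that is Jenkins' raw `consoleText` endpoint, which is easier to grep than the HTML console view:
```
# Pull the raw console log and surface the usual failure markers.
curl -sSL https://jenkins.ceph.com/job/ceph-api/80225/consoleText \
  | grep -nE 'FAIL|ERROR|Traceback' | tail -n 40
```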
2024-08-22T18:12:34.496Z
<Æmerson> I have a question: since, as I understand it, blkin was added as part of our lttng-ust efforts, does it make any sense for us to still be using lttng-ust instead of turning the tracepoints into OpenTelemetry tracepoints?
2024-08-22T18:13:01.377Z
<Æmerson> And, would either of you be averse to a ceph-tracing channel?
2024-08-22T18:13:09.794Z
<Æmerson> (I hate Slack threads so if we can get it in a channel, all the better.)
2024-08-22T18:16:11.602Z
<Casey Bodley> afaik the lttng stuff under ceph/src/tracing is unrelated to blkin
2024-08-22T18:16:34.208Z
<Casey Bodley> but should probably be unified with opentracing stuff
2024-08-22T18:16:55.155Z
<Æmerson> Really? I thought it was the motivation for blkin. Maybe it was just tracing generally.
2024-08-22T18:16:57.495Z
<Æmerson> All right.
2024-08-22T18:18:32.177Z
<Æmerson> Does anyone actually use the lttng-ust tracing? I'd like to unify it with OpenTelemetry, if nobody objects.
2024-08-22T19:37:36.300Z
<Josh Durgin> +1, I don't think it's used much, if at all, since it isn't compiled by default, and replacing it with OpenTelemetry in most cases makes a lot of sense
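For context on why it sees so little use: even on a build with the LTTng tracepoints compiled in (e.g. `-DWITH_LTTNG=ON`), nothing is emitted unless someone explicitly sets up a userspace tracing session. A sketch of the standard lttng workflow (the `osd:*` event pattern is just an example; available providers depend on the build):
```
# lttng-ust records nothing until a session is created and started.
lttng create ceph-session
lttng enable-event --userspace 'osd:*'
lttng start
# ... run the workload under test ...
lttng stop
lttng view | head
lttng destroy ceph-session
```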
2024-08-22T19:59:58.986Z
<Ivveh> is there any way to inject a different exporter port for nfs ganesha? looks very static to me:
<https://github.com/ceph/ceph/blob/3e4664a73cc3e07a91921a0a29eecacb7676b150/src/pybind/mgr/cephadm/services/nfs.py#L25>
2024-08-22T20:01:28.825Z
<Ivveh> or is there a reason behind this? or a way to disable the exporter altogether?
2024-08-22T20:02:07.719Z
<Ivveh> makes it difficult to deploy more than one instance
2024-08-22T20:05:43.308Z
<Ivveh> would be nice to have a spec option like `monitoring_port` that gets injected into ganesha.conf
2024-08-22T20:13:04.791Z
<Ivveh> <https://github.com/nfs-ganesha/nfs-ganesha/blob/V5.5/src/config_samples/config.txt#L68>
2024-08-22T20:13:32.630Z
<Ivveh> sorry, the first link was to something random; I meant to paste from 18.2.4
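Until cephadm grows something like a `monitoring_port` spec option, one avenue worth trying is the nfs mgr module's per-cluster user config, which appends extra blocks to the generated ganesha.conf. A sketch only: the actual exporter/monitoring option name must come from the config samples for your ganesha release (as linked above), and whether it can override the port cephadm hard-codes is something to verify:
```
# nfs-extra.conf should contain the (release-specific) exporter/monitoring
# port setting from ganesha's config samples; left as a placeholder here
# rather than a guessed option name.
ceph nfs cluster config set <cluster_id> -i nfs-extra.conf
# Revert with:
#   ceph nfs cluster config reset <cluster_id>
```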
