ceph - sepia - 2024-08-22

Timestamp (UTC) | Message
2024-08-22T00:07:49.904Z
<Dan Mick> the first iscsi daemon to be updated failed.  I can't find its logs.  currently combing through system calls from adding strace to the invocation.  upgrade paused.
2024-08-22T00:37:41.643Z
<Dan Mick> terrific.  looks like teuthology is offline
2024-08-22T00:42:44.383Z
<nehaojha> @Xiubo Li @Ilya Dryomov this needs your attention 
2024-08-22T00:49:06.564Z
<Dan Mick> one of three daemons is down.  I don't know why that should have brought the service down but it appears to have done
2024-08-22T00:53:31.940Z
<Dan Mick> oh, maybe because the target apparently only tries to connect to reesi002
2024-08-22T00:55:42.402Z
<Xiubo Li> I couldn't access the link
2024-08-22T00:59:53.989Z
<Dan Mick> what link
2024-08-22T01:00:41.574Z
<Xiubo Li> Isn't this issue based on <https://console-openshift-console.apps.os.sepia.ceph.com/monitoring/#/alerts?receiver=%23sepia> ?
2024-08-22T01:01:34.489Z
<Dan Mick> I don't know what that is
2024-08-22T01:03:16.357Z
<Dan Mick> I was upgrading the LRC and the first iscsi host failed.  I can't find its logs.  I can't figure out why it's dying.  It looks as though RHEV, which holds teuthology etc, is unable to access the iscsi target because it appears to only know about reesi002, which is the host that has the failed iscsi service.  I'm trying to learn something about iscsiadm to try to talk the RHEV host into accessing one of the gateway nodes that's still up.
2024-08-22T01:03:47.526Z
<Dan Mick> if you could figure out what is going wrong on reesi002 with its iscsi service that would probably help
2024-08-22T01:05:54.722Z
<Xiubo Li> That means you couldn't find any ceph-iscsi or tcmu-runner logs, right?
2024-08-22T01:06:52.686Z
<Xiubo Li> if they are not saved in dedicated log files, they should be saved in the host journal logs.
2024-08-22T01:07:03.048Z
<Dan Mick> ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a@iscsi.iscsi.reesi002.hjvvyq
2024-08-22T01:07:49.956Z
<Dan Mick> root@reesi002:/var/log/ceph/28f7427e-5558-4ffd-ae1a-51ec3042759a# ls *iscsi*
ls: cannot access '*iscsi*': No such file or directory
2024-08-22T01:07:57.103Z
<Dan Mick> I don't understand where the logs are supposed to be, probably
2024-08-22T01:08:49.690Z
<Xiubo Li> as I remember, the log file should be together with the ceph logs
2024-08-22T01:09:41.952Z
<Xiubo Li> Could you list all the contents from the ceph log directory ?
2024-08-22T01:09:56.128Z
<Dan Mick> do you have access to reesi002?
2024-08-22T01:10:11.968Z
<Xiubo Li> I think no
2024-08-22T01:11:45.948Z
<Dan Mick> so I see things in the strace of the podman run that say, for instance, write(2, "debug No available network portal for target with iqn of 'iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1'\n"
2024-08-22T01:12:05.415Z
<Dan Mick> but I can't find the string 'iqn' in any file in /var/log/ceph/2*
2024-08-22T01:13:54.956Z
<Dan Mick> oh there it is, in journalctl, under the name conmon
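(For reference, a minimal sketch of the kind of journal query that surfaces those lines on the gateway host; the `-t conmon` filter and the grep pattern are assumptions based on the output pasted below.)
```
# On reesi002: pull recent messages logged by the container runtime (conmon)
# and filter for the iSCSI target IQN string.
journalctl --since "1 hour ago" -t conmon | grep -i 'iqn'
```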
2024-08-22T01:15:10.744Z
<Dan Mick> `Aug 22 00:00:18 reesi002 conmon[1694028]: debug Setting up iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1`
`Aug 22 00:00:18 reesi002 conmon[1694028]: debug (Gateway.create_tpg) created TPG '1' for target iqn 'iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1'`
`Aug 22 00:00:18 reesi002 conmon[1694028]: debug (Gateway.create_tpg) created TPG '2' for target iqn 'iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1'`
`Aug 22 00:00:18 reesi002 conmon[1694028]: debug (Gateway.create_tpg) created TPG '3' for target iqn 'iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1'`
`Aug 22 00:00:18 reesi002 conmon[1694028]: debug (Gateway.create_target) created an iscsi target with iqn of 'iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1'`
`Aug 22 00:00:19 reesi002 conmon[1694028]: debug iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1 - Could not define LUNs: Unable to register lrc/lrc_vol with LIO: failed to add lrc/lrc_vol to LIO - error([Errno 2] No such file or directory)`
`Aug 22 00:00:19 reesi002 conmon[1694028]: debug No available network portal for target with iqn of 'iqn.2003-01.com.redhat.iscsi-gw:lrc-iscsi1'`
2024-08-22T01:16:32.676Z
<Xiubo Li> It seems the config is corrupted?
2024-08-22T01:18:16.081Z
<Dan Mick> Where would the config be?
2024-08-22T01:19:35.685Z
<Xiubo Li> should be in `gateway.conf` object
2024-08-22T01:20:41.609Z
<Xiubo Li> you can try to save it locally and then clear it and have a try
2024-08-22T01:27:03.687Z
<Dan Mick> gateway.conf object....do you mean a rados object?
2024-08-22T01:31:36.135Z
<Dan Mick> @Xiubo Li I don't know how to proceed here
2024-08-22T01:34:27.037Z
<Dan Mick> ok I did rados -p iscsi-config get gateway.conf /tmp/gw.conf
2024-08-22T01:35:28.119Z
<Dan Mick> it mentions disks lrc/lrc-vol and lrc/lrc_vol1 (note dash on one and underscore on the other)
2024-08-22T01:35:56.125Z
<Dan Mick> one of the logs above mentions lrc/lrc_vol
2024-08-22T01:36:03.576Z
<Dan Mick> could this be the issue?
2024-08-22T01:36:08.227Z
<Xiubo Li> Yeah, the rados object
2024-08-22T01:36:23.506Z
<Dan Mick> I am learning every bit of this as I go
2024-08-22T01:36:43.835Z
<Dan Mick> do you think lrc-vol vs lrc_vol is a problem?
2024-08-22T01:37:08.591Z
<Dan Mick> # rbd ls -p lrc
lrc-vol
lrc_vol
lrc_vol1
lrc_vol2
2024-08-22T01:37:13.610Z
<Dan Mick> ugh
2024-08-22T01:37:38.070Z
<Xiubo Li> no, we hit this before just because the gateway.conf was corrupted.
2024-08-22T01:38:05.513Z
<Xiubo Li> you can try clearing it to see if that's the issue
2024-08-22T01:38:41.764Z
<Dan Mick> do you mean you have seen this issue before and the cause at that prior time was that gateway.conf was corrupted?
2024-08-22T01:39:09.177Z
<Xiubo Li> yeah,
2024-08-22T01:39:23.545Z
<Xiubo Li> so please just try and confirm it
2024-08-22T01:39:28.725Z
<Dan Mick> the file I extracted it to looks valid
2024-08-22T01:39:45.828Z
<Xiubo Li> before clearing it just save it locally and later you can recover it
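(A minimal sketch of that save/clear/restore flow, assuming "clear it" means removing the object; pool and object names as used above.)
```
# Back up the current gateway.conf object before touching it.
rados -p iscsi-config get gateway.conf /tmp/gw.conf.bak

# "Clear" it for the test by removing the object (an assumption about what
# clearing means here).
rados -p iscsi-config rm gateway.conf

# If that wasn't the problem, restore the saved copy.
rados -p iscsi-config put gateway.conf /tmp/gw.conf.bak
```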
2024-08-22T01:40:50.586Z
<Dan Mick> but....you mean it's okay to read it by rados get, but it's corrupted somehow in how the iscsi processes are accessing it?
2024-08-22T01:41:11.318Z
<Dan Mick> because the extracted file looks fine
2024-08-22T01:41:26.068Z
<Xiubo Li> Did you check the contents of it ?
2024-08-22T01:41:37.712Z
<Dan Mick> the file I extracted it to looks valid
2024-08-22T01:41:44.060Z
<Dan Mick> because the extracted file looks fine
2024-08-22T01:42:01.589Z
<Xiubo Li> I meant the contents of the object
2024-08-22T01:42:07.649Z
<Xiubo Li> not the object itself in rados
2024-08-22T01:42:13.878Z
<Dan Mick> okay let me say this one more time:
2024-08-22T01:42:31.283Z
<Dan Mick> I did rados -p iscsi-config get gateway.conf /tmp/gw.conf
2024-08-22T01:42:48.201Z
<Dan Mick> /tmp/gw.conf does not look damaged
2024-08-22T01:43:00.871Z
<Xiubo Li> could you share the file
2024-08-22T01:46:24.790Z
<Dan Mick> DMed
2024-08-22T01:47:14.821Z
<Dan Mick> could I use gwcli or something to validate the file?
2024-08-22T01:47:21.931Z
<badone> Pulpito down too?
2024-08-22T01:48:04.038Z
<yuriw> 502 Bad Gateway πŸ€·β€β™‚οΈπŸΌ 
2024-08-22T01:48:14.386Z
<Dan Mick> likely everything
2024-08-22T01:57:43.528Z
<Xiubo Li> We should edit it, remove all the `lrc/lrc-vol` configs, and then store it back
2024-08-22T01:57:58.126Z
<Xiubo Li> the `clients` config was lost
2024-08-22T02:07:22.545Z
<Dan Mick> That doesn't sound safe to me.  This is supposed to be the configuration safe storage, in the cluster, right?
2024-08-22T02:08:52.174Z
<Xiubo Li> Yeah, this will lose the LUNs
2024-08-22T02:09:23.201Z
<Xiubo Li> BTW, do you have any saved `gwcli ls` output from before?
2024-08-22T02:09:53.307Z
<Dan Mick> I don't.  I don't know if any of the others who messed with the config do, or if @Adam Kraitman maybe does
2024-08-22T02:09:58.564Z
<Xiubo Li> We need to use that info to recover the config
2024-08-22T02:11:11.445Z
<Dan Mick> so merely updating the version of the iscsi daemon corrupted the configuration inside the RADOS cluster?
2024-08-22T02:11:32.733Z
<Dan Mick> in the very-specific way of removing some critical client information?
2024-08-22T02:12:07.539Z
<Dan Mick> there are two other gateways still on the old version and still apparently running; can we get any info from them?
2024-08-22T02:13:08.109Z
<Dan Mick> # ceph orch ls --service-name iscsi.iscsi
NAME         PORTS   RUNNING  REFRESHED  AGE  PLACEMENT
iscsi.iscsi  ?:5000      2/3  3m ago     8M   reesi002;reesi004;reesi005
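(A per-daemon view of the same service shows which of the three gateways is the one that is down; a minimal sketch with the same orchestrator CLI.)
```
# List the individual iscsi daemons, their status, and their hosts.
ceph orch ps --daemon-type iscsi
```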
2024-08-22T02:13:15.399Z
<Xiubo Li> Yeah, it was probably corrupted during the update by incorrect handling or something else.
2024-08-22T02:13:27.699Z
<Xiubo Li> Please share the gwcli output from other gateways.
2024-08-22T02:13:37.055Z
<Dan Mick> how do I run it
2024-08-22T02:13:43.147Z
<Xiubo Li> Let's see if we can find something we need there
2024-08-22T02:14:39.503Z
<Xiubo Li> `gwcli ls`
2024-08-22T02:14:53.016Z
<Dan Mick> gwcli isn't installed on the base system; I assume it's in the container
2024-08-22T02:15:41.211Z
<Xiubo Li> yeah
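(One way to reach `gwcli` when it only exists inside the container is `cephadm enter` on a surviving gateway host; the daemon name below is a placeholder, the real names can be listed with `ceph orch ps`.)
```
# On reesi004 or reesi005 (still-running gateways); daemon name is illustrative.
cephadm enter --name iscsi.iscsi.reesi004.xxxxxx

# Then, inside the container:
gwcli ls
```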
2024-08-22T02:17:11.215Z
<Dan Mick> appears to be hanging
2024-08-22T02:21:02.164Z
<Dan Mick> okay, just really slow.  let me run it without ^C to make sure it's complete
2024-08-22T02:38:43.181Z
<Dan Mick> got the output, sent in a PM
2024-08-22T03:48:14.548Z
<Dan Mick> status update: @Xiubo Li is driving some experiments.  We are not certain which of several rbd images (possibly more than one) was used as backing store for the iscsi gateway.  We suspect, somehow, the gateway.conf, which is stored in the cluster, has lost some of its critical configuration, and I do not know where, if anywhere, there is a backup.  So we're going to try to preserve the existing config, edit it to something that makes sense, and see if that restores access.  Hopefully @Adam Kraitman will be available soon and may have some stored info
2024-08-22T03:49:59.266Z
<Dan Mick> @Xiubo Li: could the gateway use more than one RBD image for a particular target?
2024-08-22T03:51:43.093Z
<Dan Mick> rbd info shows that the only one of the four that's been accessed today was lrc_vol, so maybe the other ones are just dead
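(Roughly the check being described; recent Ceph releases report per-image timestamps in `rbd info`, though the exact fields vary by version.)
```
# Compare creation/access/modify timestamps across the candidate images.
for img in lrc-vol lrc_vol lrc_vol1 lrc_vol2; do
    echo "== $img"; rbd info lrc/$img | grep -i timestamp
done
```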
2024-08-22T03:52:14.529Z
<Dan Mick> it was also the last one created (Jan 4 2023)
2024-08-22T03:56:13.816Z
<Dan Mick> so that makes me feel better about the idea that it's the right and only one
2024-08-22T03:56:44.459Z
<Dan Mick> is gateway.conf updated by the daemon when it stops?  or is there some manual maintenance step that should have been done to make it reflect the current configuration?
2024-08-22T03:59:08.016Z
<Dan Mick> rados shows the mtime of gateway.conf to be 2022-12-28
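(That mtime can be read straight from the object, for example:)
```
# Object-level metadata (size and mtime) of the stored gateway config.
rados -p iscsi-config stat gateway.conf
```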
2024-08-22T04:02:20.352Z
<Xiubo Li> > @Xiubo Li: could the gateway use more than one RBD image for a particular target?
Yeah
2024-08-22T04:50:53.294Z
<jcollin> ssh: connect to host teuthology.front.sepia.ceph.com port 22: No route to host
2024-08-22T05:01:58.857Z
<Dan Mick> yes, teuthology is still down
2024-08-22T06:15:32.091Z
<Ronen Friedman> If it helps with the analysis:
o10.front.* cannot resolve github.com, but can ping its IP:

`$ nslookup`
`> github.com`
`;; communications error to 127.0.0.53#53: timed out`
`;; communications error to 127.0.0.53#53: timed out`
`^C`
`$ ping 140.82.121.3`
`PING 140.82.121.3 (140.82.121.3) 56(84) bytes of data.`
`64 bytes from 140.82.121.3: icmp_seq=1 ttl=50 time=93.7 ms`
`64 bytes from 140.82.121.3: icmp_seq=2 ttl=50 time=93.7 ms`
`64 bytes from 140.82.121.3: icmp_seq=3 ttl=50 time=93.8 ms`
`64 bytes from 140.82.121.3: icmp_seq=4 ttl=50 time=93.7 ms`
`64 bytes from 140.82.121.3: icmp_seq=5 ttl=50 time=93.8 ms`
`^C`
`--- 140.82.121.3 ping statistics ---`
`5 packets transmitted, 5 received, 0% packet loss, time 4008ms`
`rtt min/avg/max/mdev = 93.672/93.737/93.821/0.052 ms`
2024-08-22T06:27:47.329Z
<badone> dns server down I guess?
2024-08-22T06:42:01.810Z
<Ronen Friedman> or the network to it. But it's back now!
2024-08-22T06:56:43.661Z
<Dan Mick> Postmortem on the LRC upgrade snafu:
2024-08-22T06:56:58.053Z
<Dan Mick> (I've recorded this in a file as well and will find a better home)
2024-08-22T06:57:50.443Z
<Dan Mick> reesi002's iscsi container wouldn't start after update.  After some poking
around, @Xiubo Li and I decided that:

1) although the lrc pool has four different rbd images in it, it seems
likely (from ctime/mtime on the images and other clues in configs)
that only one of them, lrc_vol, is the one behind a target.

2) the gateway.conf object exists in at least two places, the iscsi-config
pool and the lrc pool.  The latter looks newer and is probably the correct
one (see the comparison sketch after this write-up).  (Also confirmed by the
fact that iscsi-gateway.cfg has 'pool=lrc', which will make the tools search
for gateway.conf in lrc.)

3) iscsi-gateway.cfg, mounted to /etc/ceph when the container starts,
ought to have all the gateway IPs and the mgr IPs in its trusted_ip_list
entry, but only has ivan02 (currently active manager).  I don't know why
this would be...perhaps a bad ceph orch apply in the past, when we were
thrashing around with the last ceph-iscsi failure?

After adding ips to trusted_ip_list by editing the /var/lib/ceph copy of
iscsi-gateway.cfg, the container still wouldn't start, complaining with

Aug 22 05:15:48 reesi002 conmon[1953522]: /usr/lib/python3.9/site-packages/rtslib_fb/root.py:180: UserWarning: Cannot set dbroot to /var/target. Target devices have already been registered.
Aug 22 05:15:48 reesi002 conmon[1953522]:   warn("Cannot set dbroot to {}. Target devices have already been registered."

our best idea for fixing that was to reboot the host to clear the kernel
state (I don't know if there's a less-invasive maneuver there) but it
seems to have worked; the service is finally back up.

Now that the storage is back, I can get at the wiki page that describes
the iscsi configuration (<http://wiki.front.sepia.ceph.com/doku.php?id=services:longrunningcluster&s[]=iscsi>) and I see that "pool=lrc" is a key entry.

Remaining cleanup:

1) remove the obsolete rbd images in the lrc pool (if they're obsolete;
they seem to be)

2) figure out why iscsi-gateway.cfg's trusted_ip_list didn't include the
hosts in the ceph-iscsi service deployment, and if they need to be there
(maybe, with api_secure: false as the deployment instructions say,
it's not necessary to have trusted_ip_list)

3) **write this configuration info down** somewhere that doesn't depend
on the configuration being operational, and make sure it's complete enough
to be used by someone who's not a Ceph iscsi expert
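(A hedged sketch of the point-2 comparison between the two copies of gateway.conf, using the pool names from the write-up above.)
```
# Fetch both copies and compare their contents and object mtimes.
rados -p iscsi-config get gateway.conf /tmp/gw.conf.iscsi-config
rados -p lrc get gateway.conf /tmp/gw.conf.lrc
diff /tmp/gw.conf.iscsi-config /tmp/gw.conf.lrc

rados -p iscsi-config stat gateway.conf
rados -p lrc stat gateway.conf
```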
2024-08-22T06:58:59.761Z
<Dan Mick> I've left the upgrade paused; I'll be out tomorrow and can continue after some investigation on Friday, perhaps.
2024-08-22T07:18:25.950Z
<badone> Bring on NVMEoF πŸ˜‰
2024-08-22T07:20:16.302Z
<Sunil Angadi> Hi team,
In my PR build, the `centos` distro build is failing:
<https://shaman.ceph.com/builds/ceph/wip-sangadi1-testing-2024-08-21-1217/>
```make[2]: *** [Makefile:25: build] Error 1
make[2]: Leaving directory '/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/ceph-container/staging/main-centos-9-x86_64/daemon-base'
make[1]: *** [Makefile:73: do.image.x86_64,wip-sangadi1-testing-2024-08-21-1217,centos,9] Error 2
make: *** [Makefile:88: build.parallel] Error 2
Wed Aug 21 02:35:24 PM UTC 2024 :: rm -fr /tmp/install-deps.212605
Build step 'Execute shell' marked build as failure
New run name is '#82317 origin/wip-sangadi1-testing-2024-08-21-1217, 1a7d873adb148ef0f04c1dc2537be4f7316c94f6, centos9, crimson'```
so when I try to run `run-make-check.sh` it's asking for a password;
can somebody please help me resolve this issue?
```sangadi@teuthology:~/wip-sangadi1-testing$ ./run-make-check.sh 
Checking hostname sanity... OK
[sudo] password for sangadi: 
Sorry, try again.
[sudo] password for sangadi: 
sudo: 1 incorrect password attempt```
2024-08-22T07:52:55.514Z
<Dan Mick> Do NOT use teuthology to build Ceph.  You don't have sudo there for a reason; it's a massively shared resource.  Learn about where and how you can build Ceph.
2024-08-22T07:53:15.207Z
<Dan Mick> As for the build failure, diagnose it by reading more of the log.
2024-08-22T08:49:06.482Z
<Sunil Angadi> ok thanks @Dan Mick
2024-08-22T09:00:57.341Z
<Kyrylo Shatskyy> @Dan Mick basically anyone may use teuthology to build Ceph, but on their own instance and own lab πŸ˜„ just don't come to Sepia πŸ˜„
2024-08-22T13:51:28.403Z
<Yaarit> @Adam Kraitman @Dan Mick `telemetry.front.sepia.ceph.com` is down, can you please take a look?
2024-08-22T16:08:42.643Z
<Laura Flores> @Yaarit this is likely fallout from the LRC upgrade: <https://ceph-storage.slack.com/archives/C1HFJ4VTN/p1724309870421749>
2024-08-22T16:12:03.041Z
<Dan Mick> Yes, I thought it was obvious that when I say teuthology I mean the host in the sepia lab.  Especially since the teuthology software is not a build tool.
2024-08-22T16:20:57.578Z
<Ken Dreyer> Hi folks, I plan to attend the infra meeting at <https://meet.jit.si/ceph-infra> in 10 minutes
2024-08-22T17:17:10.676Z
<Dan Mick> oh, and 4) explain why we have three iscsi gateways, but one of them failing is enough to kill the service (and resolve that)
2024-08-22T18:49:15.979Z
<Yaarit> Can someone try restarting the vm please? I do not have access to RHEV. Not sure it will bring the vm up, but it's worth trying
2024-08-22T18:58:44.805Z
<Casey Bodley> see lots of container builds under <https://shaman.ceph.com/builds/ceph/> failing:
> ```+ docker push quay-quay-quay.apps.os.sepia.ceph.com/ceph-ci/ceph:wip-cbodley-testing-d476034-centos-stream9-x86_64-devel
> Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
> Getting image source signatures
> Copying blob sha256:c7eb4dd50a8c1da224a8562eb680303e4f9736a0f44be591424a3441f7e2f479
> Copying blob sha256:fa9a39020e334c1df2b18f1d420d6d23f775b328d7e90eed3d70ddffaadb8d8a
> Error: writing blob: uploading layer chunked: received unexpected HTTP status: 500 Internal Server Error```
2024-08-22T18:59:24.738Z
<Casey Bodley> probably LRC-related?
2024-08-22T19:27:52.064Z
<Laura Flores> Yeah, Vallari mentioned that quay.ceph.io was down at the Infra call, so likely related
2024-08-22T22:17:35.373Z
<Dan Mick> telemetry.front and telemetry-public are both running, and I can ssh to telemetry.front
2024-08-22T22:18:23.560Z
<Dan Mick> quay.ceph.io seems back now
