ceph - cephadm - 2024-11-22

Timestamp (UTC)	Message
2024-11-22T15:48:01.415Z	<verdurin> To what extent is the bootstrap host on a Cephadm cluster special? I've been working my way through the nodes, upgrading the OS then re-adding them to the cluster. The final MON node is also the bootstrap node, which has open SSH connections to the other cluster members.
2024-11-22T15:49:25.031Z	<verdurin> Is it as simple as being the current active `MGR`?
2024-11-22T15:49:29.998Z	<Ken Carlile> From what I recall doing something similar, it's not. at all. If you've been using it as a management node, I'd save out the bash history and any local files related to what you've been doing. I think I saved the /var/lib/ceph*, but I don't know that I've needed them.
2024-11-22T15:49:34.868Z	<Brian P> The SSH connections should come from the active ceph-mgr
2024-11-22T15:49:42.779Z	<verdurin> And if I force the `MGR` role to move elsewhere, that's sufficient?
2024-11-22T15:50:32.641Z	<verdurin> Okay, that's easier than I thought, then. Will certainly save the voluminous shell history, and associated files as you say.
2024-11-22T15:50:57.288Z	<Brian P> Should be. If you are asking if the initial bootstrap node can be decommissioned, the answer is yes. Make sure to keep access to the cluster, that's all.
2024-11-22T15:50:59.003Z	<Ken Carlile> ymmv, no warranty or guarantee, blah blah blah. And I've only been using ceph for about 3 months. 😄
2024-11-22T15:51:24.339Z	<verdurin> Yes, I know...
2024-11-22T15:53:23.026Z	<verdurin> Thanks both.
2024-11-22T15:54:07.386Z	<Ken Carlile> pretty sure the _admin label takes care of a lot of it for you.
2024-11-22T15:54:10.908Z	<verdurin> The complication is that the bootstrap node in this case is the first MON. With the other MONs, it was straightforward.
2024-11-22T15:54:59.973Z	<Ken Carlile> yeah, mine was that way too
2024-11-22T15:56:38.548Z	<Brian P> That is always the case, primeval mon has to be somewhere and cephadm has to run there. I assume you are comparing it with Ansible, where you can run it from anywhere.
2024-11-22T16:00:15.599Z	<verdurin> Have forced a `MGR` failover and that's worked, so I can upgrade the node. Will wait until Monday, though...
2024-11-22T16:01:06.432Z	<Ken Carlile> woot!
2024-11-22T16:15:17.342Z	<Matthew Vernon> I've tested reimaging (which wipes everything on the OS disks) the bootstrap node, and it Just Worked.
2024-11-22T16:58:33.126Z	<verdurin> @Eugen Block you were helping me with a weird stray host a few months ago. The `MGR` failover mentioned above led to the disappearance of the error...
2024-11-22T17:25:39.002Z	<Eugen Block> Yeah, for a year or two now my first suggestion has always been to fail the mgr before anything else. 😄
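
The steps discussed in this thread (keep admin access on another host, then fail the active `MGR` off the bootstrap node) can be sketched roughly as below. This is a minimal sketch, not a tested procedure: the function names `bootstrap_retirement_plan` and `run_plan` and the daemon and host names are hypothetical. It assumes `ceph orch host label add`, `ceph mgr fail` and `ceph mgr stat` behave as they do on recent cephadm releases. It only prints the commands unless `dry_run=False` is passed.

```python
import shlex
import subprocess


def bootstrap_retirement_plan(active_mgr, other_admin_host):
    """Return the ceph CLI calls as argv lists, in the order discussed in the thread."""
    return [
        # Keep access to the cluster: give another host the _admin label.
        ["ceph", "orch", "host", "label", "add", other_admin_host, "_admin"],
        # Force the active MGR (the source of the SSH connections) to fail over.
        ["ceph", "mgr", "fail", active_mgr],
        # Check which MGR is active afterwards.
        ["ceph", "mgr", "stat"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in plan:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names; substitute the real MGR daemon and host names.
    run_plan(bootstrap_retirement_plan("mon3.abcdef", "mon1"))
```

The current active `MGR` name can be read from `ceph mgr stat` before running this. Saving shell history and local files from the bootstrap node, as suggested above, stays a manual step.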
