ceph - cephadm - 2024-11-22

Timestamp (UTC) | Message
2024-11-22T15:48:01.415Z
<verdurin> To what extent is the bootstrap host on a Cephadm cluster **special**?

I've been working my way through the nodes, upgrading the OS then re-adding them to the cluster.

The final MON node is also the bootstrap node, which has open SSH connections to the other cluster members.
2024-11-22T15:49:25.031Z
<verdurin> Is it as simple as being the current active `MGR`?
2024-11-22T15:49:29.998Z
<Ken Carlile> From what I recall doing something similar, it's not. at all. If you've been using it as a management node, I'd save out the bash history and any local files related to what you've been doing. I think I saved the /var/lib/ceph*, but I don't know that I've needed them.
2024-11-22T15:49:34.868Z
<Brian P> The SSH connections should come from the active ceph-mgr
2024-11-22T15:49:42.779Z
<verdurin> And if I force the `MGR` role to move elsewhere, that's sufficient?
2024-11-22T15:50:32.641Z
<verdurin> Okay, that's easier than I thought, then.
Will certainly save the voluminous shell history, and associated files as you say.
2024-11-22T15:50:57.288Z
<Brian P> Should be.

If you are asking if the initial bootstrap node can be decommissioned, the answer is yes.
Make sure to keep access to the cluster, that's all.
2024-11-22T15:50:59.003Z
<Ken Carlile> ymmv, no warranty or guarantee, blah blah blah. And I've only been using ceph for about 3 months. 😄
2024-11-22T15:51:24.339Z
<verdurin> Yes, I know...
2024-11-22T15:53:23.026Z
<verdurin> Thanks both.
2024-11-22T15:54:07.386Z
<Ken Carlile> pretty sure the _admin label takes care of a lot of it for you.
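[Editor's sketch] In cephadm, hosts carrying the `_admin` label receive a copy of `ceph.conf` and the `client.admin` keyring, which is one way to "keep access to the cluster" before touching the bootstrap node. A minimal sketch of the commands involved, assuming a hypothetical second host named `node2`; the helper name is illustrative and it only builds the command lines, it does not run them:

```python
def keep_admin_access_commands(other_host):
    """Build the ceph CLI calls that give another host admin access.

    `other_host` is a hypothetical host name; substitute a real cluster member.
    """
    return [
        # Label the host so cephadm distributes ceph.conf and the admin keyring to it.
        ["ceph", "orch", "host", "label", "add", other_host, "_admin"],
        # List hosts and their labels to confirm the label was applied.
        ["ceph", "orch", "host", "ls"],
    ]


for cmd in keep_admin_access_commands("node2"):
    print(" ".join(cmd))
```

Running the printed commands from a host that already has admin access should leave `node2` usable as a management host once the files have been distributed.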
2024-11-22T15:54:10.908Z
<verdurin> The complication is that the bootstrap node in this case is the first MON.
With the other MONs, it was straightforward.
2024-11-22T15:54:59.973Z
<Ken Carlile> yeah, mine was that way too
2024-11-22T15:56:38.548Z
<Brian P> That is always the case: the primeval mon has to be somewhere, and cephadm has to run there.
I assume you are comparing it with Ansible, where you can run it from anywhere.
2024-11-22T16:00:15.599Z
<verdurin> Have forced a `MGR` failover and that's worked, so I can upgrade the node.
Will wait until Monday, though...
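[Editor's sketch] The failover step discussed above could look roughly like the following. It assumes that `ceph mgr stat` prints JSON containing an `active_name` field and that cephadm names MGR daemons `<host>.<suffix>`; the host names and helper names are hypothetical. The helper only decides which command to issue and does not execute anything:

```python
import json


def active_mgr(mgr_stat_json):
    """Return the active MGR daemon name from the output of `ceph mgr stat`."""
    return json.loads(mgr_stat_json)["active_name"]


def failover_plan(mgr_stat_json, host):
    """Return the commands needed to move the active MGR off `host`.

    Assumes cephadm-style daemon names of the form "<host>.<suffix>".
    Returns an empty list when the active MGR already runs elsewhere.
    """
    name = active_mgr(mgr_stat_json)
    # Strip only the last dotted component so FQDN host names still match.
    if name.rsplit(".", 1)[0] != host:
        return []
    # Fail the named daemon so a standby MGR takes over.
    return [["ceph", "mgr", "fail", name]]


# Example input shaped like `ceph mgr stat` output (values are made up).
sample = '{"epoch": 12, "available": true, "active_name": "node1.abcdef", "num_standby": 2}'
for cmd in failover_plan(sample, "node1"):
    print(" ".join(cmd))
```

After issuing the command, re-running `ceph mgr stat` should show a different `active_name` before the node is taken down.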
2024-11-22T16:01:06.432Z
<Ken Carlile> woot!
2024-11-22T16:15:17.342Z
<Matthew Vernon> I've tested reimaging (which wipes everything on the OS disks) the bootstrap node, and it Just Worked.
2024-11-22T16:58:33.126Z
<verdurin> @Eugen Block you were helping me with a weird stray host a few months ago. The `MGR` failover mentioned above led to the disappearance of the error...
2024-11-22T17:25:39.002Z
<Eugen Block> Yeah, for a year or two now my first suggestion has always been to fail the mgr before anything else. 😄
