ceph - cephadm - 2024-10-02

Timestamp (UTC) | Message
2024-10-02T15:14:36.225Z
<verdurin> A host that I put into `maintenance` for some firmware updates is now showing as `offline`.
It fails `check-host`, but I am able to SSH to it interactively, and the procedure that `cephadm` recommends in the error message, i.e.:
```
To check that the host is reachable open a new shell with the --no-hosts flag:
> cephadm shell --no-hosts

Then run the following:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
> chmod 0600 ~/cephadm_private_key
> ssh -F ssh_config -i ~/cephadm_private_key centos@<host ip>
```
does work.

I tried `ceph mgr fail` - this didn't make any difference.

We're running `16.2.15`.
2024-10-02T15:15:13.563Z
<verdurin> I also tried `ceph orch host set-addr` and that command failed.
2024-10-02T16:45:12.644Z
<verdurin> Seems similar to <https://tracker.ceph.com/issues/51629> though in our case all the hosts do have an associated IP address.
2024-10-02T17:26:48.124Z
<Adam King> Do you get anything from `ceph log last 200 cephadm` if you run it directly after running `ceph cephadm check-host <hostname>`? It might print a traceback that could help with debugging.
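
The two commands in this suggestion can be chained so the log is captured immediately after the check. This is a minimal sketch, not something posted in the thread: the function name `check_host_and_dump_log` and its optional line-count argument are illustrative assumptions, and only the two `ceph` commands quoted above are used.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and arguments are hypothetical helpers.
# It runs the host check, then dumps recent cephadm log entries right away.

check_host_and_dump_log() {
    local host="$1"          # hostname as known to the orchestrator
    local lines="${2:-200}"  # number of log entries to fetch (default 200)

    # check-host is expected to fail for the affected host, so report the
    # failure on stderr and carry on instead of aborting.
    ceph cephadm check-host "$host" \
        || echo "check-host exited non-zero for $host" >&2

    # Fetch the latest cephadm channel entries immediately, so any traceback
    # produced by the check above is still among the most recent lines.
    ceph log last "$lines" cephadm
}

# Usage (from a node with an admin keyring):
#   check_host_and_dump_log <hostname> > cephadm-check.log
```

Redirecting the output to a file, as in the usage comment, makes it easier to paste a traceback back into the channel.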
