ceph - cephadm - 2024-06-11

2024-06-11T15:38:35.091Z
<Raghu> Just for closure on this issue: we hit the same issue as <https://tracker.ceph.com/issues/57096>.
Once we added the `--no-cgroups-split` option during bootstrap, the issue was fixed for us.
2024-06-11T18:28:06.645Z
<Raghu> Hello
We are running into issues adding a new mon to a cephadm cluster.
We tried to add the machine using the following command:
```ceph orch host add host2 10.99.88.77 mon```
The above finishes successfully. This is the second machine we are trying to add to the cluster. The mon never comes up in the cluster, and we can see the following messages in the logs:
```2024-06-11T18:02:31.918+0000 7f8f2d09d700  0 log_channel(cephadm) log [INF] : Filtered out host host2: does not belong to mon public_network(s):  10.99.0.0/16, host network(s): 10.11.22.0/26,10.11.22.64/26


sudo ceph config dump  | egrep -i network
global        advanced  cluster_network                        10.99.0.0/16    *
global        advanced  public_network                         10.99.0.0/16    *```
We are currently using bird to run BGP on our physical machines.
This is what our sample configuration looks like:
```ip route ls
default proto bird src 10.99.88.77 metric 32
    nexthop via 10.11.22.1 dev ens3f0 weight 1
    nexthop via 10.11.22.65 dev ens3f1 weight 1
default via 10.11.22.1 dev ens3f0 proto static metric 100
default via 10.11.22.1 dev ens3f0 proto dhcp src 10.11.22.44 metric 100
default via 10.11.22.65 dev ens3f1 proto static metric 101
default via 10.11.22.65 dev ens3f1 proto dhcp src 10.11.22.107 metric 101
10.11.22.0/26 dev ens3f0 proto kernel scope link src 10.11.22.44 metric 100
10.11.22.64/26 dev ens3f1 proto kernel scope link src 10.11.22.107 metric 101
192.168.16.1 dev synack scope link```
When we run `cephadm list-networks`, we see the following output:
```{
    "10.11.22.0/26": {
        "ens3f0": [
            "10.11.22.44"
        ]
    },
    "10.11.22.64/26": {
        "ens3f1": [
            "10.11.22.107"
        ]
    }
}```
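To illustrate why the host gets filtered out, here is a rough sketch (not cephadm's actual code) of the membership check implied by that log line, using the `list-networks` output above and Python's standard `ipaddress` module. The function name is hypothetical:

```python
import ipaddress

# Output of `cephadm list-networks` on host2 (from above).
host_networks = {
    "10.11.22.0/26": {"ens3f0": ["10.11.22.44"]},
    "10.11.22.64/26": {"ens3f1": ["10.11.22.107"]},
}

# Configured mon public_network(s) from `ceph config dump`.
public_networks = ["10.99.0.0/16"]

def host_matches_public_network(host_networks, public_networks):
    """Return True if any detected host IP falls inside a public_network."""
    pubs = [ipaddress.ip_network(n) for n in public_networks]
    for ifaces in host_networks.values():
        for ips in ifaces.values():
            for ip in ips:
                if any(ipaddress.ip_address(ip) in p for p in pubs):
                    return True
    return False

# The BGP-announced source address 10.99.88.77 never appears in
# list-networks, so no detected IP is inside 10.99.0.0/16 and the
# host is filtered out of mon placement.
print(host_matches_public_network(host_networks, public_networks))  # False
```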
It seems cephadm only selects the IP addresses on the physical NIC interfaces and does not pick up the logical (BGP-announced) address on the machine.
Can someone please advise on what we need to do here?
2024-06-11T18:44:33.551Z
<John Mulligan> There's a very good chance that this will need a code change and cannot be trivially worked around. As you can see, the networks list doesn't know anything about this virtual device.
2024-06-11T18:44:45.631Z
<John Mulligan> If you haven't already I suggest filing a tracker issue.
2024-06-11T18:46:13.354Z
<John Mulligan> Actually, please look to see if this has been reported already. THEN if not, file a new issue.
2024-06-11T18:50:31.984Z
<John Mulligan> <https://tracker.ceph.com/issues/53496>  perhaps?
2024-06-11T18:51:57.991Z
<John Mulligan> <https://tracker.ceph.com/issues/62195> I mean
2024-06-11T18:52:20.489Z
<John Mulligan> <s><https://tracker.ceph.com/issues/53496>  perhaps?</s>
2024-06-11T18:52:44.140Z
<John Mulligan> Please ignore: ~<https://tracker.ceph.com/issues/53496>  perhaps?~
2024-06-11T19:03:11.019Z
<Raghu> The above tracker is for IPv6; should we create another tracker for the IPv4 case as well?
2024-06-11T19:04:20.254Z
<John Mulligan> I don't know. I had never heard of BIRD before today; I just tried to find something similar in the tracker. If you don't feel it is sufficiently similar, file a new issue, I suppose. They can always get marked as a dup later.
2024-06-11T19:04:39.130Z
<Raghu> ok
2024-06-11T19:06:18.825Z
<Raghu> On a general note: when we add a machine to the cluster, we specify the IP address along with the name. Why doesn't ceph use this information to configure the node? Why does the code have to detect networks and use them for communication between the nodes?
2024-06-11T19:07:26.892Z
<Adam King> FWIW, I think you can work around this for now by "manually" placing the mon daemons after setting the mon service to be unmanaged (<https://docs.ceph.com/en/latest/cephadm/services/mon/#deploying-monitors-on-a-particular-network> has some related info)
2024-06-11T19:09:32.654Z
<Adam King> We can't fully use the host ip for everything cephadm does on a host. Management of the ingress service and its VIPs for example tends to require more network information than the single IP. Typically the full networks info is a superset of that single IP anyway so it isn't an issue. It's just in this case we're effectively missing a feature to detect BGP networks
2024-06-11T19:11:00.635Z
<Adam King> Note, if you do follow the docs I linked above, stop before the "Now, enable automatic placement of Daemons" step; that will put the filtering back into effect.
2024-06-11T19:35:00.963Z
<Raghu> the following commands are exactly the ones that I already ran before:
```# After the bootstrap process is complete on the first node:
sudo ceph orch apply mon --unmanaged

ceph orch host add host1 10.99.88.77 mon```
My understanding is that if I don't run the following command, the mon would never get deployed:
```ceph orch apply mon --placement="host1"```
2024-06-11T19:38:33.948Z
<Adam King> No, the apply command overwrites past apply commands (service specs are declarative). The reason to run the last one would be to go back to automatic placement based on the service spec placement. That won't work in your case due to the network filtering. The docs section as a whole was written for cases where the hosts aren't being filtered, but you just want to use a different network for the mon daemons than cephadm uses by default. Your case is a bit different and re-enabling the automatic deployment with the second apply command will cause issues.
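For reference, the declarative service spec Adam describes can also be expressed as YAML and applied with `ceph orch apply -i mon-spec.yaml`. This is a hypothetical sketch (hostnames assumed); note that dropping `unmanaged: true` re-enables automatic placement, which in this thread's case would hit the network filtering again:

```yaml
service_type: mon
unmanaged: true      # cephadm will not add/remove mon daemons on its own
placement:
  hosts:
    - host1
    - host2          # hypothetical second mon host
```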
2024-06-11T20:17:33.973Z
<Raghu> If I run just the host add command for the mon instance, without the apply step, the only things deployed to the new machine are crash containers and nothing else.
No error messages in the logs either.
2024-06-11T20:19:20.114Z
<Adam King> if you mean something like `ceph orch host add host2 10.99.88.77 mon` then that's the command to add a host, not a mon daemon. You'd want `ceph orch daemon add mon <host1:ip-or-network1>`