ceph - crimson - 2024-11-18

Timestamp (UTC) | Message
2024-11-18T01:56:25.709Z
<Yingxin Cheng> This is to minimize the read amplification to 1x (fio-rbd read 4K -> disk read 4K)
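
(For context, a 4K random-read fio job through the rbd engine along these lines would exercise that path. A minimal sketch; the pool, image, and client names are placeholders, not taken from the thread.)
```
fio --name=randread4k \
    --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio_test \
    --rw=randread --bs=4k --iodepth=32 \
    --runtime=60 --time_based
```
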
2024-11-18T11:26:40.903Z
<Jose J Palacios-Perez> Here is a quick comparison between the default (unbalanced) and balanced allocations (a preliminary scheme where reactors are distributed across NUMA sockets). It seems that the initial balanced approach performs slightly worse than no balancing. I will soon be testing the latest code changes, which balance in terms of OSDs instead: https://files.slack.com/files-pri/T1HG3J90S-F081P82U42V/download/cyan_3osd_3react_bal_vs_unbal_4krandread_iops_vs_lat.png
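
(To sanity-check how cores map to sockets on the test host before judging a placement, standard tools suffice; nothing here is specific to this thread.)
```
# Show the NUMA-node-to-CPU mapping that the reactor placement has to respect.
lscpu | grep -i 'numa'
numactl --hardware
```
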
2024-11-18T11:36:32.068Z
<Matan Breizman> Is this Bluestore or cyanstore?
2024-11-18T11:36:40.464Z
<Jose J Palacios-Perez> cyanstore
2024-11-18T11:37:07.636Z
<Matan Breizman> might be a Cyanstore symptom as well
2024-11-18T14:02:07.484Z
<Jose J Palacios-Perez> Something is odd during the core allocation. Going step by step:
• creating the cluster with the new option:
```# MDS=0 MON=1 OSD=3 MGR=1 /ceph/src/vstart.sh --new -x --localhost --without-dashboard --cyanstore --redirect-output --crimson --crimson-smp 3 --no-restart --crimson-balance-cpu```
Looking at the debug output, here is the correct `ceph config` invocation for each OSD:
```/ceph/build/bin/ceph -c /ceph/build/ceph.conf config set osd.0 crimson_seastar_cpu_cores 0-2
INFO  2024-11-18 13:09:09,130 [shard 0:main] osd - get_early_config: set --thread-affinity 1 --cpuset 0-2

/ceph/build/bin/ceph -c /ceph/build/ceph.conf config set osd.1 crimson_seastar_cpu_cores 28-30
DEBUG 2024-11-18 13:09:10,411 [shard 0:main] monc - set_mon_vals crimson_seastar_cpu_cores = 28-30

/ceph/build/bin/ceph -c /ceph/build/ceph.conf config set osd.2 crimson_seastar_cpu_cores 3-5
DEBUG 2024-11-18 13:09:12,064 [shard 0:main] monc - set_mon_vals crimson_alien_thread_cpu_cores = 9-111
DEBUG 2024-11-18 13:09:12,064 [shard 0:main] monc - set_mon_vals crimson_seastar_cpu_cores = 3-5
```
But when validating the thread affinity, notice how some reactors are running on different sockets: all reactors of OSD.0 should be running on socket 0, but reactors 3 to 5 are clearly on socket 1 :thinking_face:: https://files.slack.com/files-pri/T1HG3J90S-F081PU9ET25/download/screenshot_2024-11-18_at_13.55.20.png
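
(A way to double-check reactor placement independently of the validation script, assuming the OSD processes are named crimson-osd; the `psr` column is the CPU each thread is currently scheduled on.)
```
# List every thread of each crimson-osd process and the CPU it runs on.
for pid in $(pgrep -f crimson-osd); do
    echo "== pid $pid =="
    ps -L -o tid,psr,comm -p "$pid"
done
```
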
2024-11-18T14:10:49.619Z
<Jose J Palacios-Perez> This is the full list; only OSD.0 seems incorrect. I'm going to destroy the cluster, ensure no .pid file is left over, and try again: https://files.slack.com/files-pri/T1HG3J90S-F081B9N9WG2/download/screenshot_2024-11-18_at_14.09.42.png
2024-11-18T14:43:50.527Z
<Jose J Palacios-Perez> Found the bug: it should remove the old *_threads.out file if it already exists: https://files.slack.com/files-pri/T1HG3J90S-F0818KUKC2Z/download/screenshot_2024-11-18_at_14.33.33.png
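
(The fix amounts to clearing the stale dumps before collecting new ones; a minimal sketch, assuming the `*_threads.out` files live in the current directory.)
```
# Stale *_threads.out files from a previous run would otherwise be
# appended to, mixing old and new affinity data.
rm -f ./*_threads.out
```
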
2024-11-18T14:45:30.957Z
<Jose J Palacios-Perez> This is from a fresh config, and definitely correct: https://files.slack.com/files-pri/T1HG3J90S-F0818L5JUPP/download/screenshot_2024-11-18_at_14.45.00.png
2024-11-18T22:29:53.182Z
<Jose J Palacios-Perez> Getting closer: the id shown is the last OSD's because it was intended to be cumulative. I'll refactor to fix that, but this shows the distribution is now balanced in terms of OSDs (see the sketch below): https://files.slack.com/files-pri/T1HG3J90S-F081DUJ0HCK/download/screenshot_2024-11-18_at_22.28.07.png
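
(A sketch of the per-OSD balancing being described: OSDs alternate between the two sockets, each taking the next free cores from its socket's range. The socket core ranges are inferred from the log above; variable names are illustrative.)
```
# Assumed topology: socket 0 owns cores 0-27, socket 1 owns cores 28-55.
SMP=3                    # reactors per OSD (--crimson-smp 3)
declare -a next=(0 28)   # next free core on socket 0 and socket 1

for osd in 0 1 2; do
    sock=$(( osd % 2 ))              # round-robin OSDs across sockets
    start=${next[$sock]}
    end=$(( start + SMP - 1 ))
    next[$sock]=$(( end + 1 ))
    echo "osd.$osd -> crimson_seastar_cpu_cores ${start}-${end} (socket $sock)"
done
```
With three OSDs this reproduces the allocation in the log above: osd.0 gets 0-2, osd.1 gets 28-30, and osd.2 gets 3-5.
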
2024-11-18T23:09:46.135Z
<Jose J Palacios-Perez> Fixed :smiley:: https://files.slack.com/files-pri/T1HG3J90S-F080ZKR7AKH/download/screenshot_2024-11-18_at_23.09.14.png
