ceph - crimson - 2024-11-15

Timestamp (UTC) | Message
2024-11-15T07:53:50.989Z
<Yingxin Cheng> Rough performance measurements after implementing seastore partial reads, far from perfect, but might be interesting :) <https://github.com/ceph/ceph/pull/60654#issue-2640092642>
2024-11-15T11:40:47.344Z
<Jose J Palacios-Perez> I need to disable the heuristic (the stop criterion for the response latency tests): if the latency std deviation is too high, the test does not increase the IO load, hence ending the main loop. Despite that, 4k random read looks pretty good: the OSD CPU utilisation remains constant and the client FIO (using libRBD) becomes the bottleneck. This is a small (3 OSD, 3 seastar reactors per OSD) balanced config: https://files.slack.com/files-pri/T1HG3J90S-F0811FBCUD9/download/cyan_3osd_3react_bal_1procs_randread_iops_vs_lat_vs_cpu.png
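A minimal sketch of the kind of stop heuristic described above, assuming a response-curve loop that steps up the IO load and records FIO latencies at each step; the names, the threshold, and the run_fio() helper are illustrative, not the actual benchmark script:

```python
# Hypothetical sketch of the stop heuristic described above; LAT_STDDEV_LIMIT
# and run_fio() are illustrative assumptions, not the real benchmark script.
import statistics

LAT_STDDEV_LIMIT = 0.2      # assumed threshold: max allowed stddev/mean ratio
DISABLE_HEURISTIC = True    # disabling it lets the loop sweep the full IO range

def response_curve(iodepths, run_fio):
    """Step up the IO load, recording mean latency and its spread at each point."""
    results = []
    for qd in iodepths:
        latencies = run_fio(queue_depth=qd)   # one FIO run, returns per-IO latencies
        mean = statistics.mean(latencies)
        stddev = statistics.pstdev(latencies)
        results.append((qd, mean, stddev))
        # Stop criterion: if latency is too noisy, do not increase the load further,
        # which ends the main loop early unless the heuristic is disabled.
        if not DISABLE_HEURISTIC and stddev > LAT_STDDEV_LIMIT * mean:
            break
    return results
```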
2024-11-15T11:51:20.334Z
<Jose J Palacios-Perez> FIO CPU and MEM utilisation, respectively (I'll coalesce all the msgr-workers into a single curve group): https://files.slack.com/files-pri/T1HG3J90S-F081E9RSGMP/download/fio_cyan_3osd_3react_bal_1procs_randread_top_cpu.png
2024-11-15T11:51:20.336Z
<Jose J Palacios-Perez> https://files.slack.com/files-pri/T1HG3J90S-F080M3E9YKZ/download/fio_cyan_3osd_3react_bal_1procs_randread_top_mem.png
2024-11-15T11:53:50.401Z
<Jose J Palacios-Perez> The following are for OSD CPU and MEM util. The former would be interesting to compare against an unbalanced configuration: https://files.slack.com/files-pri/T1HG3J90S-F0811GULQ83/download/osd_cyan_3osd_3react_bal_1procs_randread_top_cpu.png
2024-11-15T11:53:50.403Z
<Jose J Palacios-Perez> https://files.slack.com/files-pri/T1HG3J90S-F080M3NUA3Z/download/osd_cyan_3osd_3react_bal_1procs_randread_top_mem.png
2024-11-15T13:25:00.675Z
<Brett Niver> interesting, not sure what to make of it yet, but interesting 🙂
2024-11-15T14:43:09.045Z
<Jose J Palacios-Perez> Update: Bill suggested that a better balance for improved performance would be per OSD, rather than splitting the reactors of the same OSD across the available sockets. In other words, all the reactors of the same OSD should run on the same socket:
• OSD0: 3 reactors on socket 0: 0-3,
• OSD1: 3 reactors on socket 1: 28-30,
• OSD2: 3 reactors on socket 0: 4-6
@Matan Breizman, @Yingxin Cheng, what are your thoughts, please?
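A hedged sketch of the per-OSD placement Bill suggested, assuming two sockets with contiguous CPU ranges; the socket ranges, OSD and reactor counts are illustrative, not the actual machine topology or Jose's script:

```python
# Hypothetical CPU-allocation sketch: keep all reactors of one OSD on one socket,
# alternating sockets between OSDs. The socket CPU ranges are illustrative only.
SOCKETS = {0: list(range(0, 28)), 1: list(range(28, 56))}

def allocate_per_osd(num_osd: int, reactors_per_osd: int) -> dict[int, list[int]]:
    """Return {osd_id: [cpu, ...]} with every OSD kept inside a single socket."""
    cursors = {sock: 0 for sock in SOCKETS}        # next free CPU index per socket
    allocation = {}
    for osd in range(num_osd):
        sock = osd % len(SOCKETS)                  # alternate sockets between OSDs
        start = cursors[sock]
        allocation[osd] = SOCKETS[sock][start:start + reactors_per_osd]
        cursors[sock] += reactors_per_osd
    return allocation

# 3 OSDs x 3 reactors: OSD0 -> socket 0, OSD1 -> socket 1, OSD2 -> socket 0 again.
print(allocate_per_osd(3, 3))
```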
2024-11-15T14:55:59.832Z
<Jose J Palacios-Perez> This is the acceptance criterion I mentioned above. I've disabled it and re-run the tests, and will think about reorganising my Python script to attempt the balance per OSD: https://files.slack.com/files-pri/T1HG3J90S-F080VR152TG/download/heuristic-response-curves.png
2024-11-15T15:45:23.938Z
<Jose J Palacios-Perez> This is the acceptance criterion I mentioned above. I've disabled it and will re-run the tests shortly, and will think about reorganising my Python script to attempt the balance per OSD instead 🤔
2024-11-15T20:00:21.190Z
<system> file_delete
2024-11-15T20:01:38.271Z
<Jose J Palacios-Perez> I need to disable the heuristic (the stop criterion for the response latency tests): if the latency std deviation is too high, the test does not increase the IO load, hence ending the main loop. Despite that, 4k random read looks pretty good: the OSD CPU utilisation remains constant and the client FIO (using libRBD) becomes the bottleneck. This is a small (3 OSD, 3 seastar reactors per OSD) balanced config:

(I found an issue with this chart showing the wrong CPU utilisation; amended .png below)
2024-11-15T20:03:09.996Z
<Jose J Palacios-Perez> Amended chart, OSD CPU corrected: https://files.slack.com/files-pri/T1HG3J90S-F08117S63A9/download/cyan_3osd_3react_bal_1procs_randread_iops_vs_lat_vs_cpu.png
