ceph - crimson - 2024-10-31

Timestamp (UTC) | Message
2024-10-31T10:12:23.799Z
<Jose J Palacios-Perez> Ah, sorry, there is a typo at the bottom description, it should say:
• multiple OSD: multiply the number of Seastar cores by the number of OSDs.
2024-10-31T10:13:21.442Z
<Jose J Palacios-Perez> Hi Matan, thanks for the comments. Yes, we intend to use **both** sockets; the table above is an illustration, and the total figures just double, but it clarifies the cases where there is an odd number, e.g. the 3 OSD column and third row 👍 (max 52, hyperthreading case)
2024-10-31T10:15:56.562Z
<Jose J Palacios-Perez> Hi @Matan Breizman, some quick questions please: for the target of the max number of Seastar reactor threads on the system: 2 sockets, 28 physical cores per socket, 56 per socket including hyperthreading: 112 total CPUs reported by lscpu.
• Reserve 8 CPUs for the FIO client, so we have two cases:
• physical: that leaves 24 physical cores per socket for both Seastar reactors and alien threads,
• hyperthreading: 52 CPUs per socket.
Here is an example of CPU distribution scenarios. Notice that this is an illustration for a single socket; we intend to use both sockets, and the figures simply double as appropriate. What are your thoughts, please? Thanks in advance.
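The per-socket budget described above can be sketched as a small script. This is a hedged illustration, not part of the original discussion; the figures (28 physical / 56 hyperthreaded cores per socket, 4 CPUs per socket reserved for the FIO client, i.e. 8 total) and the example splits are taken from the numbers quoted in this thread, and the function name is hypothetical:

```python
# Assumed per-socket figures from the thread: 28 physical cores,
# 4 CPUs reserved per socket for the FIO client.
PHYS_PER_SOCKET = 28
FIO_RESERVED_PER_SOCKET = 4

def osd_budget(available, osds, seastar_per_osd):
    """Split one socket's core budget into Seastar reactor cores
    and leftover AlienStore (alien) cores."""
    seastar = osds * seastar_per_osd
    alien = available - seastar
    assert alien >= 0, "over-committed socket"
    return seastar, alien

physical = PHYS_PER_SOCKET - FIO_RESERVED_PER_SOCKET  # 24 usable cores
for osds, per_osd in [(1, 16), (3, 5), (8, 2)]:
    seastar, alien = osd_budget(physical, osds, per_osd)
    print(f"{osds} OSD: {seastar} Seastar cores, {alien} alien cores")
```

With both sockets in use, the totals simply double, as noted above.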
2024-10-31T12:24:24.882Z
<Matan Breizman> Got it.
I think we can drop the last row, since there aren't enough reactor threads there IMO, and this will help avoid having too many results. We can add this row back later if we find a trend that makes it relevant. What do you think?

Last thing to consider/specify is the total AlienStore worker threads (`crimson_alien_op_num_threads`) in each test.
For instance, for 8 threads per alien core in the first row of the physical table:
• 1 OSD: 8 alien cores -> 64 alien threads for the single OSD.
• 3 OSD: 9 alien cores -> 72 alien threads; each OSD should have 72/3 = 24 threads.
• 8 OSD: 8 alien cores -> 64 alien threads; each OSD should have 64/8 = 8 threads.
This will look something like this:
```
Seastar cores / AlienStore cores (threads)
1*16          / 8                (64)
3*5           / 9                (72)
8*2           / 8                (64)
```
2024-10-31T12:40:42.632Z
<Jose J Palacios-Perez> > I think that we can drop the last row since there aren't enough reactor
Yes, I agree
2024-10-31T12:49:21.875Z
<Jose J Palacios-Perez> Ok, makes sense. I will prepare a latency-target type of test plan to traverse the table. I will start with the rhs, I'm afraid (including hyperthreading), since I still need to find out how to indicate which ("physical") CPU cores to use when configuring the OSD processes for the lhs of the table 👍
2024-10-31T12:50:42.028Z
<Jose J Palacios-Perez> I can run my script to do the pinning, though. I was just wondering whether that option already exists.
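The script mentioned above is not shown in the thread; as a rough sketch of what per-process CPU pinning can look like on Linux, the standard library exposes the scheduler affinity syscalls directly. The helper name is hypothetical, and this pins the calling process rather than an OSD:

```python
import os

def pin_to_cpus(pid, cpus):
    """Restrict a process (pid 0 = the caller) to an explicit CPU set
    and return the affinity the kernel actually applied."""
    os.sched_setaffinity(pid, set(cpus))
    return os.sched_getaffinity(pid)

# Pin this process to the first two CPUs it is currently allowed on.
allowed = sorted(os.sched_getaffinity(0))
print(pin_to_cpus(0, allowed[:2]))
```

For an OSD process one would pass its pid instead, or use an external tool such as `taskset` at launch time.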
2024-10-31T14:01:43.054Z
<Matan Breizman> Can we disable hyperthreading?
2024-10-31T14:03:25.687Z
<Matan Breizman> It might also be interesting to get the results incl. hyperthreading. @Yingxin Cheng, do you have a suggestion?
2024-10-31T14:03:43.725Z
<Jose J Palacios-Perez> I was planning to try this: <https://serverfault.com/questions/235825/disable-hyperthreading-from-within-linux-no-access-to-bios>
2024-10-31T15:41:21.956Z
<Jose J Palacios-Perez> I tested the following hint and it works on the SV1 nodes:
```
NUMA node(s):           2
  NUMA node0 CPU(s):      0-7,16-23
  NUMA node1 CPU(s):      8-15,24-31

echo 0 |sudo tee /sys/devices/system/cpu/cpu{16..23}/online
echo 0 |sudo tee /sys/devices/system/cpu/cpu{24..31}/online
# lscpu
CPU(s):                   32
  On-line CPU(s) list:    0-15
  Off-line CPU(s) list:   16-31
NUMA:
  NUMA node(s):           2
  NUMA node0 CPU(s):      0-7
  NUMA node1 CPU(s):      8-15
```
I'll try that next on the o05 box; hopefully I won't need to restart the container 🤞
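On a box with a different topology, the sibling CPUs to offline can be derived from `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` rather than hard-coded. As a hedged sketch (the helper name is hypothetical), given those sibling-list strings, keep the lowest-numbered sibling online and offline the rest:

```python
def siblings_to_offline(siblings_lists):
    """siblings_lists: iterable of sysfs sibling strings such as
    '0,16' or '8-15'. Returns the sorted CPUs to take offline."""
    offline = set()
    for entry in siblings_lists:
        cpus = []
        for part in entry.split(","):
            if "-" in part:
                lo, hi = map(int, part.split("-"))
                cpus.extend(range(lo, hi + 1))
            else:
                cpus.append(int(part))
        offline.update(sorted(cpus)[1:])  # all but the first sibling
    return sorted(offline)

# On the SV1 layout above, CPU n and n+16 are hyperthread siblings:
print(siblings_to_offline(f"{n},{n + 16}" for n in range(16)))
# offlines CPUs 16-31, matching the echo commands above
```

Each returned CPU N would then get `echo 0 | sudo tee /sys/devices/system/cpu/cpuN/online`.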
