2024-09-18T05:15:01.304Z | <Venky Shankar> ah kk |
2024-09-18T05:16:06.844Z | <Venky Shankar> by default: autoclose=300s and timeout=60s |
2024-09-18T05:16:20.449Z | <Venky Shankar> what did your setting look like? |
2024-09-18T05:26:38.261Z | <Adam D> currently session_autoclose=1500, session_timeout=1200 |
2024-09-18T05:27:50.064Z | <Venky Shankar> kk. any reason for not using the defaults? |
2024-09-18T05:29:33.670Z | <Adam D> we use kdb+, which keeps a lot of files/segments during processing. We are currently on kernel 5.15, because every attempt to upgrade to a newer kernel ends with capabilities problems - I'm currently trying to debug it |
2024-09-18T05:31:18.415Z | <Adam D> the larger session_timeout is mainly caused by this kdb+ software |
2024-09-18T07:47:27.686Z | <Igor Golikov> Hi, I am still fighting to get debug symbols displayed in gdb. I am using the debug env script `ceph-debug-docker.sh`, and it pulls binaries and builds the image. But when I am in the image, the binary I need to check is stripped (in this case `/usr/bin/ceph-mds`)
Here is the info on this binary:
[root@29ed102ec572 ~]# file /bin/ceph-mds
/bin/ceph-mds: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=55337129c0b7e4adef052d4e951c28987529f0d9, for GNU/Linux 3.2.0, **stripped** |
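For anyone repeating this check across several binaries, the `file` test above can be scripted. A minimal Python sketch, assuming `file` reports the strip status as the last comma-separated field (`stripped` or `not stripped`); the helper name `is_stripped` is made up for illustration and is not part of any Ceph tooling:

```python
def is_stripped(file_description: str) -> bool:
    """Return True if a file(1) description ends with 'stripped'.

    Assumption: the strip status is the last comma-separated field,
    either 'stripped' or 'not stripped'.
    """
    # file(1) separates attributes with commas; take the last one.
    last_field = file_description.rsplit(",", 1)[-1]
    # Drop whitespace plus chat decoration such as '**stripped** |'.
    status = last_field.strip().strip("*| ").lower()
    # Exact match, so 'not stripped' is correctly reported as False.
    return status == "stripped"
```

To use it on a live system, pass it the output of `file /usr/bin/ceph-mds`.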
2024-09-18T07:48:18.622Z | <Igor Golikov> What is the correct way to debug core dumps? I feel that I am missing something and can't guess it... |
2024-09-18T07:48:52.673Z | <Venky Shankar> oh, if the binary is stripped, then does having the debuginfo packages installed not give good stack frames? (cc @Milind Changire) |
2024-09-18T07:49:21.836Z | <Igor Golikov> Thread 27 (LWP 72397):
#0 0x00007fddd808679a in ?? ()
#1 0x0000562871f7dc30 in ?? ()
#2 0x0000562800000189 in ?? ()
#3 0x000056286d5eca40 in ?? ()
#4 0x0000000000000000 in ?? () |
2024-09-18T07:49:31.301Z | <Igor Golikov> yeah, doesn't matter what I do, this is the output |
2024-09-18T07:49:56.153Z | <Milind Changire> yeah, binary should not be stripped |
2024-09-18T07:50:14.212Z | <Igor Golikov> I even built the SHA that produced the crash on a vossi machine and tried to load that binary (which is not stripped) with the provided core dump |
2024-09-18T07:50:37.582Z | <Venky Shankar> Normally, when I build a test branch using ptl-tool, I pass in `--debug-build` that allows nice stack frames to be displayed |
2024-09-18T07:51:00.811Z | <Igor Golikov> what is the ptl tool? i build with ninja |
2024-09-18T07:51:01.218Z | <Venky Shankar> I guess Yuri built a non-debug build. |
2024-09-18T07:51:09.056Z | <Milind Changire> that might not work, Igor
it's better to work with the exact build RPMs |
2024-09-18T07:51:54.968Z | <Venky Shankar> > what is the ptl tool? i build with ninja
nice little tool to build a test build for a set of PRs. It's nicely integrated with redmine to create QA trackers for testing. |
2024-09-18T07:51:57.281Z | <Igor Golikov> @Milind Changire so I don't see any way to debug the dump if the binary was stripped |
2024-09-18T07:52:11.732Z | <Venky Shankar> 😞 |
2024-09-18T07:53:55.932Z | <Igor Golikov> well i will start reading logs... |
2024-09-18T07:54:02.103Z | <Igor Golikov> thanks folks |
2024-09-18T09:54:57.318Z | <Igor Golikov> Do i need to be a part of the Ceph org on GitHub? |
2024-09-18T09:55:01.626Z | <Igor Golikov> i think i am not right no |
2024-09-18T09:55:02.994Z | <Igor Golikov> now |
2024-09-18T09:58:03.329Z | <Igor Golikov> or, being more specific, if i want to trigger a build for a specific SHA, how can i do it? :) should i access Jenkins directly? |
2024-09-18T10:41:43.638Z | <Venky Shankar> @Igor Golikov you need to push a branch to ceph-ci to get builds |
2024-09-18T10:42:23.161Z | <Venky Shankar> what's the sha? I can check if we have recent (debug) builds? |
2024-09-18T10:42:51.783Z | <Venky Shankar> If not, then you can have shaman (ceph-ci) build them for you... |
2024-09-18T10:49:08.648Z | <Igor Golikov> `sha1: 26c3fb8e197dcf7a49a54d1f4c8a7362ee35a8ea` |
2024-09-18T10:49:50.842Z | <Igor Golikov> i will first try to run with the latest squid-release, and if it doesn't crash i will try to find the fix with bisect |
2024-09-18T10:50:15.243Z | <Igor Golikov> some hands on with teuthology, this is the way to learn |
2024-09-18T10:51:16.949Z | <Igor Golikov> `teuthology-lock --lock-many 2 --machine-type smithi` returns that only 1 node can be locked. Can i omit smithi and lock any type of node? |
2024-09-18T10:58:03.809Z | <Venky Shankar> @Igor Golikov smithis are the only ones that are running tests. There used to be other nodes, but those aren't used... |
2024-09-18T10:59:33.113Z | <Venky Shankar> @Igor Golikov <https://shaman.ceph.com/builds/ceph/squid-release/26c3fb8e197dcf7a49a54d1f4c8a7362ee35a8ea/> -- this build is 47 days old... |
2024-09-18T11:00:19.600Z | <Venky Shankar> so, yeh, try latest squid-release or maybe even one of squid branches (<https://shaman.ceph.com/builds/ceph/squid/>) |
2024-09-18T11:26:14.824Z | <Igor Golikov> if the build is available, are the repos for this one available as well? |
2024-09-18T11:28:40.413Z | <Igor Golikov> apparently there is no such repo available in Shaman, that's why I was not able to create the debug docker container with squid-release and this SHA |
2024-09-18T11:46:48.056Z | <Venky Shankar> The packages are located on chacra nodes. E.g.: <https://chacra.ceph.com/r/ceph/quincy/b12291d110049b2f35e32e0de30d70e9a4c060d2/centos/8/flavors/default/x86_64/> |
2024-09-18T11:47:21.708Z | <Venky Shankar> It's possible that for the squid release branch (sha) the packages got garbage collected 😕 |
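The chacra URL above appears to follow a fixed layout. A minimal Python sketch that rebuilds it, with the layout inferred only from that single example (the function name `chacra_repo_url` and its parameters are hypothetical, and other projects, distros, or flavors may be laid out differently):

```python
def chacra_repo_url(ref, sha1, distro, distro_version,
                    flavor="default", arch="x86_64", project="ceph",
                    base="https://chacra.ceph.com/r"):
    """Build a chacra repo URL.

    Assumed layout, taken from one example URL in this chat:
    <base>/<project>/<ref>/<sha1>/<distro>/<version>/flavors/<flavor>/<arch>/
    """
    parts = [base, project, ref, sha1, distro, str(distro_version),
             "flavors", flavor, arch]
    # Join the path segments and keep the trailing slash from the example.
    return "/".join(parts) + "/"


# Rebuilds the quincy example quoted in the chat.
url = chacra_repo_url("quincy",
                      "b12291d110049b2f35e32e0de30d70e9a4c060d2",
                      "centos", 8)
```

A URL built this way can still return 404 if the packages for that sha have been garbage collected.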
2024-09-18T13:31:53.377Z | <Venky Shankar> That was my home dir eating up space. Moved that to an lvm now. space reclaimed! |
2024-09-18T13:31:59.472Z | <Venky Shankar> thx @Patrick Donnelly |
2024-09-18T14:01:29.984Z | <gregsfortytwo> Oh if this is actually a squid RC I think it’s built with release flags instead of dev/debug builds. And maybe not in the same place? |
2024-09-18T15:20:15.394Z | <Patrick Donnelly> np |
2024-09-18T19:02:59.622Z | <olesalscheider> I upgraded ubuntu 22.04 to 24.04 which also brought an update of ceph from reef to squid (rc). This is just a test install to play around, so it does not matter too much if data is lost. |
2024-09-18T19:03:36.264Z | <olesalscheider> But since the upgrade ceph-mon is broken and fails to start with the following backtrace: dpaste.com/4QY9VTQKW.txt |
2024-09-18T19:04:08.802Z | <olesalscheider> I tried rc2 from -proposed (https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2065515) but it still fails. |
2024-09-18T19:04:35.772Z | <olesalscheider> Do you have any idea what might be wrong? |
2024-09-18T19:38:31.725Z | <gregsfortytwo> There was an encoding problem that slipped into a release [https://github.com/ceph/ceph/pull/53340#discussion_r1399255031](https://github.com/ceph/ceph/pull/53340#discussion_r1399255031) but I thought it was cleaned up going forward |
2024-09-18T19:39:53.547Z | <gregsfortytwo> @Patrick Donnelly @Venky Shankar are we going to see a rash of reef->squid upgrade failures on that bal_rank_mask issue? We need a strategy and to communicate about it more clearly if this is an ongoing problem :/ |
2024-09-18T19:41:50.322Z | <olesalscheider> I found that issue. But what does this mean in practice for upgrading? Do I just need a "new enough" or "old enough" version and it will work? Or is something corrupted now which I cannot restore? |
2024-09-18T19:46:04.325Z | <gregsfortytwo> I thought newer versions had code to deal with it, thus my confusion here too :) |
2024-09-18T20:17:56.830Z | <Patrick Donnelly> there was never a release with that encoding bug |
2024-09-18T20:18:03.286Z | <Patrick Donnelly> we caught it before it went out |
2024-09-18T20:44:09.062Z | <gregsfortytwo> So it was just because of the kernel client that we had to muck with it in release versions? |
2024-09-18T20:44:24.574Z | <gregsfortytwo> There’s something else causing a decode issue, then, and I’ve no idea what it would be |
2024-09-18T20:54:46.818Z | <Patrick Donnelly> I see you are installing v19.1.1, perhaps you were running a dev version of reef with the encoding bug? |
2024-09-18T20:54:55.212Z | <Patrick Donnelly> s/running/upgrading from/ |