2024-06-13T06:33:59.962Z | <Adam Kraitman> It happened while you ran this bulk operation on 8 issues ? |
2024-06-13T06:55:45.371Z | <Leonid Usov> no, unrelated. Happened when I was working with a single issue. |
2024-06-13T06:57:16.459Z | <Leonid Usov> I posted that screenshot immediately when I got it, plus a few seconds to find the thread. Hopefully you can find relevant traces in the logs |
2024-06-13T11:22:17.144Z | <Leonid Usov> Can we please configure the `QA Approved` status of the `QA Run` tracker type to be “closing” or “resolving”, i.e. a ticket in that status will be shown as ~~completed~~, and issues that are linked as “blocked by” this ticket will be resolvable. |
2024-06-13T15:03:33.874Z | <Adam Kraitman> Maybe there is some plugin that adds that functionality to Redmine. I can test it if you find something from that list that does it: <https://www.redmine.org/plugins?utf8=%E2%9C%93&page=1&sort=&v=5.1> |
2024-06-13T15:18:29.513Z | <Leonid Usov> Hm.. What does the `QA Closed` state mean? Should we move to that after approving a qa run? |
2024-06-13T15:19:46.925Z | <Leonid Usov> OK, it’s actually documented. Let me try and see if that state works; if so, it’s a better fit |
2024-06-13T15:19:50.891Z | <Leonid Usov> https://files.slack.com/files-pri/T1HG3J90S-F077TA2N31U/download/image.png |
2024-06-13T15:19:52.325Z | <Leonid Usov> <https://tracker.ceph.com/projects/ceph-qa/wiki> |
2024-06-13T15:43:16.981Z | <yuriw> `QA Closed` means what it says: "closed", and all PRs were merged |
2024-06-13T15:44:51.272Z | <Leonid Usov> Yes, this worked. We’ll follow suit by closing approved runs |
2024-06-13T15:46:49.201Z | <Leonid Usov> Are there other reasons to have a QA Run in the QA Closed state besides being approved first? |
2024-06-13T15:48:08.361Z | <yuriw> It could be that, for whatever reason, you decide not to test, "untag" the PRs, and stop testing this batch; then you'd close the tracker IMO |
2024-06-13T15:48:24.671Z | <yuriw> that's not too often tho |
2024-06-13T15:53:50.880Z | <Leonid Usov> yeah.. so that was the reason I hadn’t considered “Closed” before. Without an explicit Rejected state, Closed becomes ambiguous. And there may be value in leaving a run in a final state that records approval |
2024-06-13T16:40:23.466Z | <Patrick Donnelly> @Adam Kraitman @Dan Mick can this be addressed easily? <https://tracker.ceph.com/issues/66337> |
2024-06-13T16:40:35.315Z | <Patrick Donnelly> I'd like to rotate the client.admin key but this (at least) blocks that |
2024-06-13T16:56:47.303Z | <Yuval Lifshitz> I am trying to test my shaman build, which looks all green: <https://shaman.ceph.com/builds/ceph/wip-yuval-64305/7e679576a8082e7a83db4ceb2120950fe445aa4a/>
but I get this error in teuthology:
```teuthology.exceptions.ScheduleFailError: Scheduling yuvalif-2024-06-13_16:55:06-rgw:notifications-wip-yuval-64305-distro-default-smithi failed: Packages for os_type 'centos', flavor default and ceph hash '7e679576a8082e7a83db4ceb2120950fe445aa4a' not found``` |
2024-06-13T16:56:58.873Z | <Yuval Lifshitz> is this related to the centos8 change? |
2024-06-13T16:58:39.939Z | <yuriw> You're likely hitting the issue that centos8 went EOL and was removed
Try running on c9 only and/or modify the tests not to use c8 |
2024-06-13T17:02:47.381Z | <Yuval Lifshitz> all of the related tests are pointing to centos_latest.yaml and ubuntu_latest.yaml |
2024-06-13T17:02:59.447Z | <Yuval Lifshitz> no centos8 there |
2024-06-13T17:03:13.829Z | <yuriw> are you sure those point to c9? |
2024-06-13T17:06:39.783Z | <Yuval Lifshitz> Yes, I think this has been the case for quite some time |
2024-06-13T17:10:02.364Z | <Yuval Lifshitz> this was done about a year ago: <https://github.com/ceph/ceph/commit/a85f50c24bd478144fa02df39af47944cb7bc33e> |
2024-06-13T17:53:22.292Z | <Adam King> Someone else (forget who) told me to add `--distro centos --distro-version 9` to my teuthology-suite commands and that seems to work. It also doesn't filter the jobs to only centos 9 ones in case you wanted ubuntu jobs as well. |
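For illustration, a sketch of what such an invocation might look like, combining the `--distro`/`--distro-version` workaround mentioned above with the suite and branch from this thread (values are taken from this conversation, not a recommendation):
```
# Illustrative only: schedule the suite discussed in this thread while
# pinning the distro, as suggested above. Adjust suite, branch, and machine type.
teuthology-suite -v \
  -s rgw:notifications \
  -m smithi \
  -c wip-yuval-64305 \
  --ceph-repo ceph-ci \
  --distro centos --distro-version 9
```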
2024-06-13T18:00:11.441Z | <Dan Mick> it can at least be addressed. fog and signer are not hard; chacra is a little more complex just because restarting it can interrupt builds and we don't have a great interlock for that. |
2024-06-13T18:01:02.114Z | <Dan Mick> I noticed this week that our OCP instance is also using the LRC as backing store for some things, and I'm not at all sure I know what secret(s) it's using. Have you cataloged that at all? |
2024-06-13T18:19:49.904Z | <Zack Cerza> Is your teuthology copy out-of-date? The default CentOS version was updated to 9 a few weeks ago. |
2024-06-13T18:22:15.792Z | <Zack Cerza> This worked for me just now: `teuthology-suite -v --owner zmc -s rgw:notifications -m smithi --priority 9000 -c wip-yuval-64305 --ceph-repo ceph-ci` |
2024-06-13T18:32:40.810Z | <Patrick Donnelly> the three instances noted in the ticket are the places I saw the admin credential being used |
2024-06-13T18:32:55.970Z | <Patrick Donnelly> if we get rid of those three, I can probably see better if anything else is using the admin key even outside of cephfs |
2024-06-13T18:44:11.444Z | <Dan Mick> my teuthology was out of date. make sure you pull the new commits @Yuval Lifshitz |
2024-06-13T18:45:12.303Z | <Dan Mick> yeah, because there are definitely consumers of rbd and rgw too, and I'd be shocked if none of them were using client.admin |
2024-06-13T18:46:39.318Z | <yuriw> I wonder, if adding `--distro centos --distro-version 9` helps, why do we have to use it? It seems like it should not be required |
2024-06-13T18:47:18.230Z | <Dan Mick> agreed. according to Zack's post it's not |
2024-06-13T18:47:24.255Z | <Yuval Lifshitz> thanks! updating teuthology fixed the issue |
2024-06-13T18:49:31.551Z | <yuriw> I actually keep teuthology updated automatically via crontab, but sometimes it requires rerunning `bootstrap`, and that has to be done manually AFAIK
Maybe we can rerun `bootstrap` on schedule as well 🤷♂️ |
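As a rough sketch of that idea, assuming the checkout path is hypothetical and that `./bootstrap` is safe to rerun unattended:
```
# Hypothetical crontab entry: pull teuthology nightly and rerun bootstrap.
# The checkout path and log location are placeholders; adjust as needed.
0 3 * * * cd /home/teuthworker/src/teuthology && git pull --ff-only && ./bootstrap >> /tmp/teuthology-update.log 2>&1
```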
2024-06-13T18:51:12.976Z | <yuriw> what do you do @Zack Cerza to stay kosher? |
2024-06-13T19:31:17.791Z | <Dan Mick> this is a really dumb question, but: where does radosgw store its ceph auth secret? |
2024-06-13T19:32:32.051Z | <Dan Mick> I would have expected a keyring file in /etc/ceph on the rgw host |
2024-06-13T19:33:52.690Z | <Dan Mick> oh. /var/lib/ceph/<instance> for containerized daemons. nm |
2024-06-13T19:38:25.587Z | <Dan Mick> ok, rgw isn't affected; its ceph auth is local to the rgw host (in the LRC). (of course). I think OCP only uses RGW. |
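One way to sanity-check which client each containerized daemon is using, assuming a cephadm-style layout where keyrings live under /var/lib/ceph/<fsid>/<daemon-name>/ (a sketch, not the exact LRC layout):
```
# Find keyrings used by containerized daemons and print the entity name each one holds.
sudo find /var/lib/ceph -maxdepth 3 -name keyring 2>/dev/null \
  | xargs -r sudo grep -H '^\['
```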
2024-06-13T19:43:33.627Z | <Zack Cerza> I guess when I see something unexpected I check what version I have and update if I can - both for things I work on, and things I don't |
2024-06-13T19:43:43.115Z | <Zack Cerza> I suppose some sort of update notification could be useful here though? |
2024-06-13T20:28:43.129Z | <Dan Mick> @Patrick Donnelly fog didn't seem to have an issue mounting a cephfs without a ceph.conf, but signer is having issues and seems to be demanding a ceph.conf. Did something change in later kernel versions or something? I don't know why it would need it; in both cases the mon addresses are in the fstab line |
2024-06-13T20:29:00.731Z | <Dan Mick> ```did not load config file, using default settings.
2024-06-13T13:21:40.396-0700 7f852f48bf40 -1 Errors while parsing config file!
2024-06-13T13:21:40.396-0700 7f852f48bf40 -1 can't open ceph.conf: (2) No such file or directory
2024-06-13T13:21:40.396-0700 7f852f48bf40 -1 Errors while parsing config file!
2024-06-13T13:21:40.396-0700 7f852f48bf40 -1 can't open ceph.conf: (2) No such file or directory
unable to get monitor info from DNS SRV with service name: ceph-mon
2024-06-13T13:21:40.400-0700 7f852f48bf40 -1 failed for service _ceph-mon._tcp
2024-06-13T13:21:40.400-0700 7f852f48bf40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory``` |
2024-06-13T20:32:27.278Z | <Dan Mick> oh. those are lies. |
2024-06-13T20:32:37.676Z | <Dan Mick> perhaps the issue is that the cephfs auth doesn't allow access to /signer |
2024-06-13T20:36:56.029Z | <Dan Mick> ok, stuck; I can't find a cephfs named signer |
2024-06-13T20:58:47.552Z | <Patrick Donnelly> sorry, was afk |
2024-06-13T20:58:52.701Z | <Patrick Donnelly> you figured out signer? |
2024-06-13T20:58:55.944Z | <Patrick Donnelly> from the ticket update I just saw |
2024-06-13T20:59:46.900Z | <Dan Mick> yeah. I don't know what the issue was; perhaps a transient failure is the best I've got. I was distracted by 1) the error noise and 2) not being certain that fstab entries using subdirs were actually supported |
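For reference, a sketch of the kind of fstab entry being discussed: a kernel cephfs mount of a subdirectory with the mon addresses inline and a non-admin client name. The hostnames, client name, secret file, and paths below are hypothetical:
```
# /etc/fstab (illustrative): mount the /signer subdirectory as client.signer
# rather than client.admin. Both 'name=' and 'secretfile=' must point at the
# non-admin key, which was the pitfall mentioned a few messages below.
mon1.example.com:6789,mon2.example.com:6789,mon3.example.com:6789:/signer  /mnt/signer  ceph  name=signer,secretfile=/etc/ceph/signer.secret,noatime,_netdev  0  0
```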
2024-06-13T21:00:05.337Z | <Dan Mick> but it's working. I've just restarted chacra and am making sure it's working |
2024-06-13T21:01:02.130Z | <Dan Mick> looks good |
2024-06-13T21:02:20.667Z | <Patrick Donnelly> I'm seeing this new one: |
2024-06-13T21:02:26.827Z | <Patrick Donnelly> https://files.slack.com/files-pri/T1HG3J90S-F0781ST8PK5/download/untitled |
2024-06-13T21:02:38.735Z | <Patrick Donnelly> fyi Dan I'm just using this command: |
2024-06-13T21:02:58.074Z | <Patrick Donnelly> https://files.slack.com/files-pri/T1HG3J90S-F078ELBSGAD/download/untitled |
2024-06-13T21:03:17.797Z | <Patrick Donnelly> then grep `admin` |
2024-06-13T21:03:40.319Z | <Dan Mick> yeah, at one point I had only changed 'secret' and not 'name' in the mount line |
2024-06-13T21:03:49.270Z | <Patrick Donnelly> oh |
2024-06-13T21:04:22.198Z | <Dan Mick> there should be no active session for name=admin from signer now |
2024-06-13T21:04:44.222Z | <Patrick Donnelly> I still see 2 |
2024-06-13T21:04:48.116Z | <Patrick Donnelly> perhaps lazy unmounts? |
2024-06-13T21:04:56.655Z | <Dan Mick> I just used umount |
2024-06-13T21:05:02.124Z | <Patrick Donnelly> O.o |
2024-06-13T21:05:09.620Z | <Patrick Donnelly> well let's give it a little time to see |
2024-06-13T21:05:28.486Z | <Patrick Donnelly> okay so if that's truly fixed then we're pretty close to done |
2024-06-13T21:05:30.279Z | <Dan Mick> hm. "mount | grep ceph" shows four mounts of the same fs |
2024-06-13T21:05:37.191Z | <Dan Mick> wt... |
2024-06-13T21:05:38.302Z | <Patrick Donnelly> overlaid mounts? |
2024-06-13T21:06:18.238Z | <Patrick Donnelly> I have to go afk again but you can use `for v in $(ceph mon stat --format=json-pretty | jq -r '.quorum | map(.name) | .[]'); do ceph tell mon.$v sessions ; done | less` to hunt for more uses of the admin key |
2024-06-13T21:06:20.410Z | <Patrick Donnelly> i see at least 4 more |
2024-06-13T21:06:23.752Z | <Dan Mick> well it beats me, but I umounted 4 times |
2024-06-13T21:06:30.691Z | <Dan Mick> none failed, and now mount shows none |
2024-06-13T21:06:44.543Z | <Dan Mick> now remounted |
2024-06-13T21:07:23.329Z | <Patrick Donnelly> admin sessions appear to be gone |
2024-06-13T21:07:31.538Z | <Patrick Donnelly> so all cephfs mounts with client.admin are gone; yay! |
2024-06-13T21:07:35.981Z | <Dan Mick> wacky. cool |
2024-06-13T21:07:51.661Z | <Patrick Donnelly> but ya, you can use the above command I pasted to hunt for more uses |
2024-06-13T21:08:02.404Z | <Patrick Donnelly> those are mon client sessions so it should capture everything |
2024-06-13T21:08:08.482Z | <Patrick Donnelly> except transient clients (like some crontab job) |
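Combining that command with the earlier `grep admin` step gives a one-liner along these lines to flag remaining admin sessions (a sketch; it assumes `jq` is available and that the session dump includes the client name, as in the output being grepped above):
```
# Dump mon client sessions from every quorum member and keep only client.admin entries.
for v in $(ceph mon stat --format=json-pretty | jq -r '.quorum | map(.name) | .[]'); do
  ceph tell mon.$v sessions
done | grep client.admin
```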
2024-06-13T21:08:26.765Z | <Patrick Donnelly> Thanks Dan! |
2024-06-13T21:08:47.901Z | <Dan Mick> yw |
2024-06-13T21:09:45.328Z | <Dan Mick> if and when you do rotate the key, please lmk |
2024-06-13T22:43:46.747Z | <yuriw> Definitely 👍 I'm not even sure how to check the version.
I just do git pull and bootstrap |