ceph - sepia - 2024-08-05

Timestamp (UTC) | Message
2024-08-05T07:32:32.352Z
<Guillaume Abrioux> is anybody taking a look at this ?
2024-08-05T07:33:29.570Z
<Guillaume Abrioux> @yuriw @Casey Bodley?
2024-08-05T12:51:50.579Z
<Casey Bodley> apparently not, <https://shaman.ceph.com/builds/ceph/reef/> still red
2024-08-05T12:59:40.948Z
<Guillaume Abrioux> yeh... I saw that earlier too 😕
2024-08-05T13:00:03.125Z
<Guillaume Abrioux> i have some reef PRs that are blocked because of that 😕
2024-08-05T13:56:50.117Z
<yuriw> 😞
2024-08-05T18:16:38.503Z
<Laura Flores> Hey @Patrick Donnelly, re <https://github.com/ceph/ceph/pull/58800/>, I was looking in pulpito to check the rados main suite run (on 101 priority) that should have been scheduled on Sunday, but it doesn't seem like it was scheduled.

The latest run I see is this, which was scheduled on the old priority of 951 according to teuthology-queue:
<https://pulpito.ceph.com/teuthology-2024-07-28_20:00:18-rados-main-distro-default-smithi/>

Did you have to perform any additional step besides merging your PR to have your crontab changes take effect?

cc @Zack Cerza
2024-08-05T18:29:00.310Z
<Kyrylo Shatskyy> @Adam Kraitman thanks, in this case, maybe closing the PR would be better, to avoid confusion
2024-08-05T19:00:13.950Z
<Josh Durgin> try rebasing now, made it an annotated tag
2024-08-05T20:38:06.084Z
<Patrick Donnelly> I'll take a look
2024-08-05T20:47:49.495Z
<Patrick Donnelly> ```(virtualenv) teuthology@teuthology:~$ export TEUTHOLOGY_SUITE_ARGS="--non-interactive --newest=100 --ceph-repo=https://git.ceph.com/ceph.git --suite-repo=https://git.ceph.com/ceph.git --machine-type smithi"
(virtualenv) teuthology@teuthology:~$ /home/teuthology/ceph/qa/nightlies/schedule_subset.sh 100000 --ceph main --suite rados -p 101 --dry-run
teuthology-suite --subset=12614/100000 --non-interactive --newest=100 --ceph-repo=https://git.ceph.com/ceph.git --suite-repo=https://git.ceph.com/ceph.git --machine-type smithi --ceph main --suite rados -p 101 --dry-run
2024-08-05 20:43:16,424.424 INFO:teuthology.suite:Using random seed=1516
2024-08-05 20:43:16,425.425 INFO:teuthology.suite.run:kernel sha1: distro
2024-08-05 20:43:16,851.851 INFO:teuthology.suite.run:ceph sha1: 1afd5b7cf03e185fab489cdd6edc5a6cda9399c7
2024-08-05 20:43:16,851.851 INFO:teuthology.suite.run:skipping ceph package verification
2024-08-05 20:43:16,851.851 INFO:teuthology.suite.run:ceph branch: main 1afd5b7cf03e185fab489cdd6edc5a6cda9399c7
2024-08-05 20:43:16,860.860 INFO:teuthology.repo_utils:Fetching git.ceph.com_ceph_main from origin
2024-08-05 20:43:22,025.025 INFO:teuthology.suite.run:teuthology branch: main 4bc527c722abc094dbc33e04519a3647224076f6
2024-08-05 20:43:22,052.052 INFO:teuthology.suite.build_matrix:Subset=12614/100000
2024-08-05 20:44:19,493.493 INFO:teuthology.suite.run:Suite rados in /home/teuthology/src/git.ceph.com_ceph_main/qa/suites/rados generated 336 jobs (not yet filtered or merged)
2024-08-05 20:44:24,291.291 INFO:teuthology.suite.util:Container build incomplete
2024-08-05 20:44:24,293.293 ERROR:teuthology.suite.run:Packages for os_type 'centos', flavor default and ceph hash '1afd5b7cf03e185fab489cdd6edc5a6cda9399c7' not found
2024-08-05 20:44:24,293.293 INFO:teuthology.suite.util:Looking for parent commits: http://githelper.ceph.com/ceph.git/history?committish=1afd5b7cf03e185fab489cdd6edc5a6cda9399c7&count=101
2024-08-05 20:44:24,533.533 INFO:teuthology.suite.util:Container build incomplete
2024-08-05 20:44:24,535.535 ERROR:teuthology.suite.run:Packages for os_type 'centos', flavor default and ceph hash '01a9f0854432929967272b9dc57c6726eab000a9' not found
2024-08-05 20:44:25,883.883 ERROR:teuthology.suite.run:Packages for os_type 'ubuntu', flavor default and ceph hash 'cc5533f39fc9a6a7d4d8b3f71694b4ced8fc20e0' not found
2024-08-05 20:44:26,017.017 INFO:teuthology.suite.util:Container build incomplete
2024-08-05 20:44:26,018.018 ERROR:teuthology.suite.run:Packages for os_type 'centos', flavor default and ceph hash '88f123005546b61348260c9df558eb33589acc95' not found
2024-08-05 20:44:26,155.155 INFO:teuthology.suite.util:Container build incomplete
2024-08-05 20:44:26,157.157 ERROR:teuthology.suite.run:Packages for os_type 'centos', flavor default and ceph hash '9c7e72cf93a79e48cf48522b854b62e0685826e4' not found
2024-08-05 20:44:29,251.251 INFO:teuthology.suite.run:--newest supplied, backtracked 5 commits to 93ef537d2bb6d0c4eb0443bb47a842ef98f1e3a1
Traceback (most recent call last):
  File "/home/teuthology/teuthology/virtualenv/bin/teuthology-suite", line 8, in <module>
    sys.exit(main())
  File "/home/teuthology/teuthology/scripts/suite.py", line 226, in main
    return teuthology.suite.main(args)
  File "/home/teuthology/teuthology/teuthology/suite/__init__.py", line 137, in main
    run.prepare_and_schedule()
  File "/home/teuthology/teuthology/teuthology/suite/run.py", line 438, in prepare_and_schedule
    num_jobs = self.schedule_suite()
  File "/home/teuthology/teuthology/teuthology/suite/run.py", line 681, in schedule_suite
    self.check_priority(len(jobs_to_schedule))
  File "/home/teuthology/teuthology/teuthology/suite/run.py", line 551, in check_priority
    util.schedule_fail(msg, dry_run=self.args.dry_run)
  File "/home/teuthology/teuthology/teuthology/suite/util.py", line 77, in schedule_fail
    raise ScheduleFailError(message, name)
teuthology.exceptions.ScheduleFailError: Scheduling failed: Unable to schedule 333 jobs with priority 101.

Use the following testing priority
10 to 49: Tests which are urgent and blocking other important development.
50 to 74: Testing a particular feature/fix with less than 25 jobs and can also be used for urgent release testing.
75 to 99: Tech Leads usually schedule integration tests with this priority to verify pull requests against main.
100 to 149: QE validation of point releases.
150 to 199: Testing a particular feature/fix with less than 100 jobs and results will be available in a day or so.
200 to 1000: Large test runs that can be done over the course of a week.
Note: To force run, use --force-priority```
2024-08-05T20:47:57.678Z
<Patrick Donnelly> @Laura Flores ^ too many jobs
2024-08-05T20:48:13.629Z
<Patrick Donnelly> I think that's the cause?
2024-08-05T20:49:32.846Z
<Patrick Donnelly> not sure why this is a new problem, perhaps that subset got bigger recently so scheduling failed
2024-08-05T20:49:40.871Z
<Patrick Donnelly> solution is to add `--force-priority`
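For readers following along, the kind of check that raised `ScheduleFailError` in the paste above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not teuthology's actual code: the function name `priority_allows` is hypothetical, and the job-count limits (25 jobs below priority 75, 100 jobs below priority 150) are assumptions chosen to be consistent with the help text and with the failure seen at priority 101 with 333 jobs. The real `check_priority` in `teuthology/suite/run.py` may use different limits.

```python
def priority_allows(priority: int, num_jobs: int, force: bool = False) -> bool:
    """Return True if a run of num_jobs may be scheduled at this priority.

    Hypothetical sketch: lower numbers mean more urgent, so large runs are
    only accepted at numerically high (less urgent) priorities.
    """
    if force:
        # Mirrors the effect of --force-priority: skip the check entirely.
        return True
    if priority < 75:
        # Assumed limit for the urgent ranges (help text: "less than 25 jobs").
        return num_jobs <= 25
    if priority < 150:
        # Assumed limit that would explain the failure at priority 101.
        return num_jobs <= 100
    # 150 and above: no limit modelled in this sketch.
    return True


print(priority_allows(101, 333))              # the case from the log
print(priority_allows(101, 333, force=True))  # same run with the override
print(priority_allows(951, 333))              # the old nightly priority
```

Under these assumptions the first call is rejected and the other two are accepted, which matches the thread: the run scheduled fine at the old priority of 951 and needs `--force-priority` at 101.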
2024-08-05T21:59:00.375Z
<yuriw> @Zack Cerza now I don't seem to be able to schedule anything due to `RuntimeError: Read beyond file size detected, file is corrupted.`
2024-08-05T21:59:19.797Z
<yuriw> I did rerun `bootstrap`
2024-08-05T22:00:06.460Z
<yuriw> https://files.slack.com/files-pri/T1HG3J90S-F07F6DZG031/download/untitled
2024-08-05T22:05:55.740Z
<Dan Mick> did you git pull before running bootstrap
2024-08-05T22:50:05.329Z
<Laura Flores> I see, thank you Patrick!
2024-08-05T22:58:49.037Z
<Laura Flores> Mind approving again @Patrick Donnelly? <https://github.com/ceph/ceph/pull/59030/files>
2024-08-05T22:59:00.664Z
<Laura Flores> Mind approving again @Patrick Donnelly? <https://github.com/ceph/ceph/pull/59030/>
2024-08-05T23:33:03.834Z
<yuriw> Yes, it’s in my crontab 4 times a day
2024-08-05T23:36:57.960Z
<Dan Mick> ok.  it doesn't look like there were any code changes to address the "Read beyond" error, so neither pull nor bootstrap should have made any difference.  Zack's theory is in <https://tracker.ceph.com/issues/65750>.  I don't know any more, but I know 'latest teuthology' doesn't seem to be indicated
2024-08-05T23:37:46.075Z
<yuriw> I will try later, but that’s been annoying for a while now
