ceph - ceph-devel - 2024-06-18

Timestamp (UTC) | Message
2024-06-18T07:28:46.015Z
<Orit Wasserman> Is there a reason that "ceph windows tests" became required for merging PRs?
2024-06-18T07:29:22.481Z
<Orit Wasserman> It is failing because of the setup and environment
2024-06-18T07:31:10.939Z
<Orit Wasserman> ```Setting up ca-certificates-java (20190909ubuntu1.2) ...
head: cannot open '/etc/ssl/certs/java/cacerts' for reading: No such file or directory```
2024-06-18T07:42:50.084Z
<Lucian Petrut> Do you have a PR link? I see that all recent jobs have passed: <https://jenkins.ceph.com/view/Windows/job/ceph-windows-pull-requests/>

New patches often broke Windows support, until the job became mandatory. IMO the job is really stable and all problems were addressed in a timely manner, so the job should remain mandatory.
2024-06-18T07:43:16.756Z
<Orit Wasserman> <https://jenkins.ceph.com/job/ceph-windows-pull-requests/42028/consoleFull#-21121487677dc3600c-28d8-42e2-9653-291a0c275fdf>
2024-06-18T07:43:34.309Z
<Orit Wasserman> <https://github.com/ceph/ceph/pull/54671>
2024-06-18T07:43:55.419Z
<Lucian Petrut> I see that one of the tests failed:

```ceph_test_librbd.TestMigration.Stress2
[2024-06-17T09:53:52.000Z] subunit-trace reports failures: Command failed: type C:\workspace\test_results\subunit.out | subunit-trace
One or more test suites have failed
At C:\workspace\repos\ceph-win32-tests\test_host\run_tests.ps1:28 char:9
+         throw "One or more test suites have failed"
+         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OperationStopped: (One or more test suites have failed:String) [], RuntimeException
    + FullyQualifiedErrorId : One or more test suites have failed
 
+ EXIT_CODE=1
+ [[ 1 -eq 124 ]]```
2024-06-18T07:45:28.122Z
<Orit Wasserman> make check that runs it is fine
2024-06-18T07:45:46.123Z
<Lucian Petrut> yeah, "make check" doesn't actually run that test
2024-06-18T07:46:01.787Z
<Orit Wasserman> ``` cluster:
    id:     93e9bfb8-3dcf-4d00-ab81-bea41ff3bcad
    health: HEALTH_WARN
            Module 'rbd_support' has failed dependency: No module named 'dateutil'
            1 pool(s) do not have an application enabled
            3 pool(s) have no replicas configured```
2024-06-18T07:46:09.505Z
<Lucian Petrut> that's unrelated
2024-06-18T07:46:34.924Z
<Orit Wasserman> The PR is not to libRBD
2024-06-18T07:46:38.794Z
<Orit Wasserman> it is for the Mon
2024-06-18T07:46:44.279Z
<Orit Wasserman> the error is unrelated
2024-06-18T07:46:58.959Z
<Lucian Petrut> well sure, it could be a flaky test. let's try a recheck
2024-06-18T07:49:10.434Z
<Lucian Petrut> For what it's worth, here's the failing test:

<https://jenkins.ceph.com/job/ceph-windows-pull-requests/42028/artifact/artifacts/test_results/out/ceph_test_librbd/ceph_test_librbd_results.log>

```/home/ubuntu/ceph/src/test/librbd/test_Migration.cc:150: Failure
Value of: src_bl.contents_equal(dst_bl)
  Actual: false
Expected: true
[  FAILED  ] TestMigration.Stress2 (32903 ms)```
2024-06-18T07:50:31.425Z
<Lucian Petrut> I've triggered a recheck, let's see how it goes.
2024-06-18T07:53:10.752Z
<Orit Wasserman> This looks like we read false when it is actually true (could be a Windows vs. Linux issue)
```after prepare snap: snap0, block 41943040~4194304 differs
src block: : 
00000000  59 59 59 59 59 00 00 00  00 00 00 00 00 00 00 00  |YYYYY...........|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
003ffff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00400000
dst block: 11e6719843de: 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
003ffff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00400000
/home/ubuntu/ceph/src/test/librbd/test_Migration.cc:150: Failure
Value of: src_bl.contents_equal(dst_bl)
  Actual: false
Expected: true```
YYYY = true
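For reference, a minimal Python sketch of how a mismatch like the one in the dump above can be located. The buffers are synthetic stand-ins built from what the dump shows (five 0x59 bytes, which is ASCII 'Y', followed by zeros in the source block, and all zeros in the destination block); they are not the actual test data, and `first_mismatch` is a hypothetical helper, not part of `test_Migration.cc`.

```python
def first_mismatch(src: bytes, dst: bytes):
    """Return (offset, length) of the first differing run, or None if equal."""
    if src == dst:
        return None
    limit = min(len(src), len(dst))
    # Find the first offset where the two buffers disagree.
    start = next((i for i in range(limit) if src[i] != dst[i]), limit)
    if start == limit:
        # Common prefix is identical; the buffers only differ in length.
        return (limit, max(len(src), len(dst)) - limit)
    # Extend the run while the bytes keep differing.
    end = start
    while end < limit and src[end] != dst[end]:
        end += 1
    return (start, end - start)


BLOCK_SIZE = 4194304  # block length reported in the log (41943040~4194304)

# Synthetic stand-ins for the two blocks shown in the hex dump.
src_block = b"Y" * 5 + bytes(BLOCK_SIZE - 5)
dst_block = bytes(BLOCK_SIZE)

print(first_mismatch(src_block, dst_block))
```

With these stand-in buffers the differing run is the first five bytes, i.e. data present in the source block is missing from the destination block.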
2024-06-18T12:19:13.703Z
<Zac Dover> <https://github.com/ceph/ceph/pull/58109>
2024-06-18T12:19:20.264Z
<Zac Dover> @Stefan Kooman ^^^
2024-06-18T12:34:48.746Z
<Orit Wasserman> @Lucian Petrut it passed, thank you!
2024-06-18T12:50:40.747Z
<Stefan Kooman> Looks good to me!
2024-06-18T12:53:24.617Z
<Zac Dover> @Stefan Kooman, when we get the blessing from @rzarzynski, we'll merge and backport. Should this go to Squid, Reef, and Quincy?
2024-06-18T13:28:20.828Z
<Stefan Kooman> Ideally, yes.
2024-06-18T22:44:57.320Z
<kchheda3> anyone seen this before
```Issue Backporting workflow run
Issue Backporting: All jobs have failed```
Ever since I synced my ceph fork, I have been continuously getting these error emails.
it fails in the create-backports workflow
```Run python3 src/script/backport-create-issue     
WARNING:root:Missing issues will be created in Backport tracker of the relevant Redmine project
ERROR:root:Redmine credentials are required to perform this operation. Please provide either a Redmine key (via ~/.redmine_key or $REDMINE_API_KEY) or a Redmine username and password (via --user and --password). Optionally, one or more issue numbers can be given via positional argument(s). In the absence of positional arguments, the script will loop through all issues in Pending Backport status.```
Why is this workflow triggered as a cron job, and why did I suddenly start getting these failures?
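The error message above names three credential sources: `$REDMINE_API_KEY`, `~/.redmine_key`, and `--user`/`--password`. A minimal Python sketch of such a lookup follows. It is not the actual code of `src/script/backport-create-issue`; the function name `resolve_redmine_credentials`, its parameters, and the precedence order are assumptions made for illustration.

```python
import os
from pathlib import Path


def resolve_redmine_credentials(env=None, key_file=None, user=None, password=None):
    """Return a dict of credentials, or None if no source provides any.

    Assumed precedence (hypothetical): environment variable, then key file,
    then explicit username and password.
    """
    env = os.environ if env is None else env
    key_file = Path.home() / ".redmine_key" if key_file is None else Path(key_file)

    # 1. API key from the environment.
    key = env.get("REDMINE_API_KEY")
    if key and key.strip():
        return {"key": key.strip()}

    # 2. API key from the key file, if it exists and is non-empty.
    if key_file.is_file():
        content = key_file.read_text().strip()
        if content:
            return {"key": content}

    # 3. Username and password, as would be passed via --user and --password.
    if user and password:
        return {"username": user, "password": password}

    return None


if __name__ == "__main__":
    if resolve_redmine_credentials() is None:
        print("No Redmine credentials found")
```

When none of the three sources is set, a lookup like this returns nothing, which corresponds to the "Redmine credentials are required" error in the workflow output.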
