ceph - sepia - 2024-06-18

Timestamp (UTC) | Message
2024-06-18T07:22:43.896Z
<Guillaume Abrioux> @Adam Kraitman We have been facing issues with Jenkins workers for some time now. Do you have any ETA for when things will get back to normal..?
2024-06-18T07:29:57.251Z
<Guillaume Abrioux> <https://2.jenkins.ceph.com/job/ceph-ansible-prs-centos-non_container-lvm_batch/3578/console>
2024-06-18T07:30:49.963Z
<Guillaume Abrioux> <https://github.com/ceph/ceph-build/blob/main/ansible/examples/centos8-vagrant.yml> this playbook needs to be updated
2024-06-18T07:31:50.379Z
<Guillaume Abrioux> I'd suggest we use containerized vagrant instead
2024-06-18T07:32:07.705Z
<Guillaume Abrioux> so we can get rid of all this mess <https://github.com/ceph/ceph-build/blob/main/ansible/examples/centos8-vagrant.yml#L64-L80>
2024-06-18T08:22:41.863Z
<Milind Changire> Looks like jenkins lost connectivity to chacra again:
```[urllib3.connectionpool][DEBUG ] Starting new HTTPS connection (1): 3.chacra.ceph.com:443
[urllib3.connectionpool][DEBUG ] <https://3.chacra.ceph.com:443> "POST /binaries/ceph/wip-mchangir-DEBUG-fs-volume-ls-failure-by-waiting-for-MDSS_REQUIRED-mds-to-be-active-a05b9bfd73e-debug/a05b9bfd73e61783a4757f3fc916f1541d7ba334/centos/9/x86_64/flavors/default/ HTTP/1.1" 500 141
[chacractl.api.binaries][WARNING] 500 -> <html>
  <head>
    <title>Internal Server Error</title>
  </head>
  <body>
    <h1><p>Internal Server Error</p></h1>
    
  </body>
</html>

[chacractl.util][WARNING] while trying a request, got an exception: 500 Server Error: Internal Server Error for url: <https://3.chacra.ceph.com/binaries/ceph/wip-mchangir-DEBUG-fs-volume-ls-failure-by-waiting-for-MDSS_REQUIRED-mds-to-be-active-a05b9bfd73e-debug/a05b9bfd73e61783a4757f3fc916f1541d7ba334/centos/9/x86_64/flavors/default/>
Traceback (most recent call last):
  File "/tmp/venv.OfbTVQuRHL/bin/chacractl", line 6, in <module>
    main.ChacraCtl()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/main.py", line 38, in __init__
    self.main(argv)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/decorators.py", line 68, in newfunc
    return f(*a, **kw)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/main.py", line 82, in main
    parser.dispatch()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/tambo/dispatcher.py", line 21, in dispatch
    result = instance.main()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/api/binaries.py", line 191, in main
    self.post(url, filename)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/util.py", line 83, in inner_wrapper
    raise final_excep
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/util.py", line 70, in inner_wrapper
    value = function(*args, **kwargs)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/api/binaries.py", line 122, in post
    response.raise_for_status()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: <https://3.chacra.ceph.com/binaries/ceph/wip-mchangir-DEBUG-fs-volume-ls-failure-by-waiting-for-MDSS_REQUIRED-mds-to-be-active-a05b9bfd73e-debug/a05b9bfd73e61783a4757f3fc916f1541d7ba334/centos/9/x86_64/flavors/default/>
Tue Jun 18 07:28:12 AM UTC 2024 :: rm -fr /tmp/install-deps.744878
Build step 'Execute shell' marked build as failure```
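Note on reading the log above: the urllib3 DEBUG line records a response (`500 141`), so the request did reach 3.chacra.ceph.com and the server answered with an error. A minimal sketch of how one might tell that case apart from a log where no response was received at all. The helper name `classify_upload_log` and the sample path are hypothetical and not part of chacractl:

```python
import re

# urllib3's DEBUG line ends with: "<METHOD> <path> HTTP/<version>" <status> <length>
RESPONSE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) (?P<length>\S+)'
)


def classify_upload_log(log_text: str) -> str:
    """Return a rough verdict for a console log (hypothetical helper)."""
    match = RESPONSE_RE.search(log_text)
    if match is None:
        # No response line found: the request probably never completed,
        # which is what a real connectivity problem would look like.
        return "no-response"
    status = int(match.group("status"))
    if status >= 500:
        # The server was reachable but failed while handling the request.
        return "server-error"
    if status >= 400:
        return "client-error"
    # Anything below 400 is treated as success in this sketch.
    return "ok"


# Sample line modeled on the log above; the path is a shortened placeholder.
sample = (
    '[urllib3.connectionpool][DEBUG ] https://3.chacra.ceph.com:443 '
    '"POST /binaries/ceph/example/ HTTP/1.1" 500 141'
)
print(classify_upload_log(sample))  # server-error
```

A "server-error" verdict suggests looking at the chacra service itself rather than at the network path from the Jenkins builders.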
2024-06-18T11:21:25.958Z
<Leonid Usov> just navigating..

I hope logging is enabled for Redmine so that we can review the causes of these crashes: https://files.slack.com/files-pri/T1HG3J90S-F078ECW431U/download/image.png
2024-06-18T11:26:44.157Z
<Adam Kraitman> I can't really look at this right now, but can you tell me where you see it? I mean, what actions are you doing?
2024-06-18T11:27:01.671Z
<Leonid Usov> at this time I was just browsing to one of the predefined filters
2024-06-18T11:27:27.812Z
<Leonid Usov> every time this happens I do multiple things in short succession. Here, I was just jumping between different issues and queries
2024-06-18T11:27:43.240Z
<Leonid Usov> often I do a bulk update
2024-06-18T11:28:29.832Z
<Leonid Usov> I couldn’t ever find a stable way to reproduce it, but it happens regularly (at least for me) so we have to look at the logs to figure it out
2024-06-18T11:29:08.581Z
<Adam Kraitman> okay maybe I will have some time later and I will look at it
2024-06-18T11:49:56.893Z
<Adam Kraitman> I found some bots that are trying to DDoS the tracker, and we are now blocking them. Tell me if you get this error again
2024-06-18T12:56:08.870Z
<Milind Changire> Looks like jenkins lost connectivity to chacra again:
```[urllib3.connectionpool][DEBUG ] Starting new HTTPS connection (1): 3.chacra.ceph.com:443
[urllib3.connectionpool][DEBUG ] <https://3.chacra.ceph.com:443> "POST /binaries/ceph/wip-mchangir-DEBUG-fs-volume-ls-failure-by-waiting-for-MDSS_REQUIRED-mds-to-be-active-a05b9bfd73e-debug/a05b9bfd73e61783a4757f3fc916f1541d7ba334/centos/9/x86_64/flavors/default/ HTTP/1.1" 500 141
[chacractl.api.binaries][WARNING] 500 -> <html>
  <head>
    <title>Internal Server Error</title>
  </head>
  <body>
    <h1><p>Internal Server Error</p></h1>
    
  </body>
</html>

[chacractl.util][WARNING] while trying a request, got an exception: 500 Server Error: Internal Server Error for url: <https://3.chacra.ceph.com/binaries/ceph/wip-mchangir-DEBUG-fs-volume-ls-failure-by-waiting-for-MDSS_REQUIRED-mds-to-be-active-a05b9bfd73e-debug/a05b9bfd73e61783a4757f3fc916f1541d7ba334/centos/9/x86_64/flavors/default/>
Traceback (most recent call last):
  File "/tmp/venv.OfbTVQuRHL/bin/chacractl", line 6, in <module>
    main.ChacraCtl()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/main.py", line 38, in __init__
    self.main(argv)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/decorators.py", line 68, in newfunc
    return f(*a, **kw)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/main.py", line 82, in main
    parser.dispatch()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/tambo/dispatcher.py", line 21, in dispatch
    result = instance.main()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/api/binaries.py", line 191, in main
    self.post(url, filename)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/util.py", line 83, in inner_wrapper
    raise final_excep
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/util.py", line 70, in inner_wrapper
    value = function(*args, **kwargs)
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/chacractl/api/binaries.py", line 122, in post
    response.raise_for_status()
  File "/tmp/venv.OfbTVQuRHL/lib64/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: <https://3.chacra.ceph.com/binaries/ceph/wip-mchangir-DEBUG-fs-volume-ls-failure-by-waiting-for-MDSS_REQUIRED-mds-to-be-active-a05b9bfd73e-debug/a05b9bfd73e61783a4757f3fc916f1541d7ba334/centos/9/x86_64/flavors/default/>
Tue Jun 18 07:28:12 AM UTC 2024 :: rm -fr /tmp/install-deps.744878
Build step 'Execute shell' marked build as failure```
or, is this something else ?
2024-06-18T13:39:11.392Z
<Adam Kraitman> Yes, I am trying to migrate it first; the next step will be to look at the spam we are seeing
2024-06-18T14:57:24.871Z
<Adam Kraitman> I am fixing it now
2024-06-18T14:57:51.606Z
<Adam Kraitman> I am fixing the builders now
2024-06-18T14:57:57.297Z
<Guillaume Abrioux> ok..
2024-06-18T18:12:06.359Z
<Zack Cerza> I merged the <https://github.com/ceph/teuthology/pull/1955|ipmi rework PR> yesterday, and the corresponding <https://sentry.ceph.com/organizations/ceph/issues/52903/?project=2|Sentry event> is showing zero power-on failures since then - and Grafana is showing reimage failures sitting at zero as well!: https://files.slack.com/files-pri/T1HG3J90S-F078S3VU1U4/download/screenshot_2024-06-18_at_11.58.21.png
2024-06-18T18:27:42.179Z
<Laura Flores> Nice!
2024-06-18T19:02:22.204Z
<yuriw> builds seem to be very slow <https://shaman.ceph.com/builds/ceph/wip-yuri8-testing-2024-06-18-0723/>
A couple got built and no more
2024-06-18T19:05:49.571Z
<Patrick Donnelly> PSA: client.admin key has been rotated on <https://tracker.ceph.com/issues/66508#change-265988>
2024-06-18T19:05:54.534Z
<Patrick Donnelly> on LRC*
2024-06-18T20:26:51.674Z
<Laura Flores> I was just trying to pull a CentOS Stream 9 image from Docker, and found that the official CentOS Docker images have been deprecated: <https://hub.docker.com/_/centos>
Does anyone know of an alternative?
2024-06-18T20:37:02.575Z
<Laura Flores> I guess they're available on quay.io: <https://centos.org/stream9/>
