2024-08-28T07:44:19.491Z | <Venky Shankar> @rzarzynski Could you plan to have a look at <https://tracker.ceph.com/issues/67595> please? |
2024-08-28T07:48:11.173Z | <Henrik Korkuc> is it possible that the <https://tracker.ceph.com/issues/46845> issue has reappeared? I wasn't able to run OSDs (mons and mgrs worked fine) in an IPv6-only environment until I set ms_bind_ipv4 to false. I was using cephadm to deploy |
2024-08-28T09:07:26.580Z | <Rost Khudov> But I still got a problem with 2 tests:
test_metadata_filter_ampq
```botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in NotificationConfiguration.TopicConfigurations[0].Filter: "Metadata", must be one of: Key```
and
test_ps_s3_tags_on_master
```botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in NotificationConfiguration.TopicConfigurations[0].Filter: "Tags", must be one of: Key```
is there any reason for that? |
2024-08-28T09:08:18.644Z | <Yuval Lifshitz> these tests (and other tests as well) use extensions to the AWS API |
2024-08-28T09:09:28.707Z | <Yuval Lifshitz> there are instructions for that here: <https://github.com/ceph/ceph/tree/main/examples/rgw/boto3#users> |
2024-08-28T09:09:50.088Z | <Yuval Lifshitz> note that this is not a new thing. and unrelated to the localhost change |
2024-08-28T09:11:40.986Z | <Yuval Lifshitz> will add a note about that here: <https://github.com/ceph/ceph/blob/main/src/test/rgw/bucket_notification/README.rst> as well |
2024-08-28T09:12:10.349Z | <Rost Khudov> yes, thank you, because it is not clear when you are just running RGW notification tests |
2024-08-28T09:27:06.587Z | <Yuval Lifshitz> trying to build "squid" and keep hitting this error:
```librados.so: undefined reference to `Message::encode_otel_trace(ceph::buffer::v15_2_0::list&, unsigned long) const'
librados.so: undefined reference to `Message::decode_otel_trace(ceph::buffer::v15_2_0::list::iterator_impl<true>&, bool)'
librados.so: undefined reference to `fmt::v9::vformat[abi:cxx11](fmt::v9::basic_string_view<char>, fmt::v9::basic_format_args<fmt::v9::basic_format_context<fmt::v9::appender, char> >)'```
any idea? |
2024-08-28T09:50:45.595Z | <Rost Khudov> and the documentation itself is not really clear; there are no clear steps for how to include this extra file |
2024-08-28T09:51:47.915Z | <Yuval Lifshitz> the doc says:
For the standard client to support these extensions, the: service-2.sdk-extras.json file should be placed under: ~/.aws/models/s3/2006-03-01/ directory. For more information see here. |
2024-08-28T09:52:08.818Z | <Yuval Lifshitz> you copy service-2.sdk-extras.json to ~/.aws/models/s3/2006-03-01/ |
2024-08-28T09:53:12.165Z | <Rost Khudov> I think the `.aws` folder exists only when you install awscli and run the configure command, but it doesn't exist if you install boto3 with pip |
2024-08-28T09:53:44.613Z | <Yuval Lifshitz> ok. so please create it. it would work with boto3 as well |
2024-08-28T09:53:52.915Z | <Yuval Lifshitz> will add this to the doc |
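The copy step discussed above can be sketched in Python. This is a minimal sketch, not taken from the Ceph docs: `install_sdk_extras` and its `home` parameter are hypothetical names, and `src` is wherever your copy of `service-2.sdk-extras.json` lives. It creates `~/.aws/models/s3/2006-03-01/` first, because that directory may be missing when boto3 was installed with pip and awscli was never configured.

```python
import shutil
from pathlib import Path


def install_sdk_extras(src, home=None):
    """Copy service-2.sdk-extras.json into the per-user model directory.

    `install_sdk_extras` and `home` are hypothetical names for this sketch.
    """
    # Default to the home directory of the user running this code.
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".aws" / "models" / "s3" / "2006-03-01"
    # The directory may not exist if `aws configure` was never run.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "service-2.sdk-extras.json"
    shutil.copyfile(src, target)
    return target
```

Because the default target is the current user's home directory, this should be run as the same user that later runs the tests.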
2024-08-28T10:21:09.141Z | <Rost Khudov> hmm, looks like just creating the `~/.aws/models/s3/2006-03-01/` directory and copying the json file there does not work with boto3 by default |
2024-08-28T10:21:52.782Z | <Rost Khudov> but according to this [doc](https://github.com/boto/botocore/blob/develop/botocore/loaders.py#L33) it should |
2024-08-28T10:22:17.717Z | <Yuval Lifshitz> never had issues with that locally or the test machines |
2024-08-28T10:22:46.742Z | <Yuval Lifshitz> make sure you use the same user for running the test |
2024-08-28T10:31:47.912Z | <Rost Khudov> when I copy to botocore/data/... it works
so should work with .aws folder as well
thank you for the help! |
2024-08-28T10:33:42.585Z | <Yuval Lifshitz> interesting. where is this directory? |
2024-08-28T10:35:04.344Z | <Rost Khudov> when you install package with pip it goes to `/usr/local/lib/python{version}/site-packages/`
here is the full path:
**`/usr/local/lib/python3.9/site-packages/botocore/data/s3/2006-03-01/service-2.sdk-extras.json`** |
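The path above depends on the Python version and install prefix. A small sketch, assuming a pip-style layout like the one quoted, builds the same path for the running interpreter; `botocore_extras_path` is a hypothetical helper and it only constructs the path, it does not check that botocore is actually installed there.

```python
import sysconfig
from pathlib import Path


def botocore_extras_path(site_packages=None):
    """Build the extras-file path inside a pip-installed botocore.

    Hypothetical helper: it builds the path only and does not verify
    that botocore exists at that location.
    """
    if site_packages is None:
        # "purelib" is the site-packages directory of the running interpreter.
        site_packages = sysconfig.get_paths()["purelib"]
    return (Path(site_packages) / "botocore" / "data" / "s3"
            / "2006-03-01" / "service-2.sdk-extras.json")


print(botocore_extras_path("/usr/local/lib/python3.9/site-packages"))
```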
2024-08-28T10:35:25.059Z | <Yuval Lifshitz> thanks |
2024-08-28T10:36:45.233Z | <Rost Khudov> Could you post a link to the PR with the updated doc here?
Maybe I will be able to add something |
2024-08-28T10:38:16.840Z | <Yuval Lifshitz> i tried to summarize in a tracker: <https://tracker.ceph.com/issues/67768>
feel free to add more info there, and you are very welcome to contribute and create a PR to fix the docs |
2024-08-28T10:40:34.842Z | <Rost Khudov> I think I have in mind what I want to add to doc, will try to create PR today |
2024-08-28T10:59:55.056Z | <Yuval Lifshitz> sounds good. will review once you have that |
2024-08-28T11:19:33.825Z | <Matan Breizman> Hey, can you share the issue? |
2024-08-28T11:38:10.696Z | <Yonatan Zaken> Sure,
This is the output for `sudo ./install-deps.sh` on my WSL.
```dh binary
dh_update_autotools_config
dh_autoreconf
create-stamp debian/debhelper-build-stamp
dh_prep
dh_auto_install --destdir=debian/ceph-build-deps/
dh_install
dh_installdocs
dh_installchangelogs
dh_perl
dh_link
dh_strip_nondeterminism
dh_compress
dh_fixperms
dh_missing
dh_dwz
dh_strip
dh_makeshlibs
dh_shlibdeps
dh_installdeb
dh_gencontrol
dh_md5sums
dh_builddeb
dpkg-deb: error: control directory has bad permissions 777 (must be >=0755 and <=0775)
dh_builddeb: error: dpkg-deb --root-owner-group --build debian/ceph-build-deps .. returned exit code 2
dh_builddeb: error: Aborting due to earlier error
make: *** [debian/rules:3: binary] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2
Error in the build process: exit status 2
dpkg: error: cannot access archive 'ceph-build-deps_15.2.0-1_amd64.deb': No such file or directory
mk-build-deps: dpkg --unpack failed```
I understood this might be because of umask or fmask values that are set on the mount directory that is used.
Running `umask` i get: 0022
This is the `/etc/wsl.conf` content I currently have:
```[boot]
systemd=true```
|
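To see whether a directory would trip the `dpkg-deb` permission error quoted above, its mode can be inspected directly. The sketch below is an assumption-laden reading of the message "must be >=0755 and <=0775": it accepts exactly 0755 and 0775 (only the group-write bit may vary) and does not claim to reproduce dpkg's own check. `control_dir_mode_ok` and `check_dir` are hypothetical names.

```python
import os
import stat


def control_dir_mode_ok(mode):
    """Return True if the permission bits are 0755 or 0775.

    Assumption: this is one interpretation of the dpkg-deb message,
    not a copy of dpkg's implementation.
    """
    # Mask out the group-write bit, then require exactly rwxr-xr-x.
    return (stat.S_IMODE(mode) & ~stat.S_IWGRP) == 0o755


def check_dir(path):
    """Return (permission bits, ok) for a directory on disk."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode, control_dir_mode_ok(mode)
```

For example, `check_dir("debian/ceph-build-deps/DEBIAN")` could be run from the build directory; that path is a guess at where the control directory sits, so adjust it to your tree.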
2024-08-28T13:08:37.103Z | <Matan Breizman> Looks WSL specific, did you try this:
<https://www.reddit.com/r/bashonubuntuonwindows/comments/a7v5d8/problems_with_dpkgdeb_bad_permissions_how_do_i/> |
2024-08-28T13:21:25.813Z | <Yonatan Zaken> I will try and update, thanks Matan |
2024-08-28T15:25:04.568Z | <Casey Bodley> weekly rgw meeting starting soon in <https://meet.google.com/mmj-uzzv-qce> (<https://pad.ceph.com/p/rgw-weekly>) |
2024-08-28T16:22:07.712Z | <Casey Bodley> rgw jobs on all releases started failing today with `AssertionError: remote smithi044.front.sepia.ceph.com has osd roles, but no osd devices were specified!`, any idea what changed? |
2024-08-28T16:25:10.086Z | <Casey Bodley> ```2024-08-28T16:03:02.534 DEBUG:teuthology.misc:devs=['/dev/vg_nvme/lv_1', '/dev/vg_nvme/lv_2', '/dev/vg_nvme/lv_3', '/dev/vg_nvme/lv_4']
2024-08-28T16:03:02.534 DEBUG:teuthology.orchestra.run.smithi044:> stat /dev/vg_nvme/lv_1
2024-08-28T16:03:02.588 DEBUG:teuthology.orchestra.run:got remote process result: 1
2024-08-28T16:03:02.588 INFO:teuthology.orchestra.run.smithi044.stderr:stat: cannot statx '/dev/vg_nvme/lv_1': No such file or directory
2024-08-28T16:03:02.588 DEBUG:teuthology.misc:get_scratch_devices: /dev/vg_nvme/lv_1 does not exist
2024-08-28T16:03:02.589 INFO:tasks.ceph:osd dev map: {}
2024-08-28T16:03:02.589 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_cf5021f85c4b0bf435c74a1183036b3d19af44b5/teuthology/contextutil.py", line 30, in nested
    vars.append(enter())
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_b3b2fa5e3c1cddde679d8fca5fc24bc1f25fe87a/qa/tasks/ceph.py", line 676, in cluster
    assert roles_to_devs, \
AssertionError: remote smithi044.front.sepia.ceph.com has osd roles, but no osd devices were specified!``` |
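The log above shows each configured device being checked with `stat`, every check failing, and the resulting device map ending up empty. A minimal local sketch of that filtering pattern follows; it is not teuthology's code (which runs `stat` on the remote host), and `existing_devices` and `require_osd_devices` are hypothetical names.

```python
import os


def existing_devices(devs):
    """Keep only the device paths that exist on this host."""
    return [dev for dev in devs if os.path.exists(dev)]


def require_osd_devices(devs):
    """Raise if none of the configured devices exist (hypothetical check)."""
    found = existing_devices(devs)
    if not found:
        raise RuntimeError("no osd devices found among: %s" % ", ".join(devs))
    return found
```

With the four `/dev/vg_nvme/lv_*` paths from the log and none of them present, `existing_devices` would return an empty list, which mirrors the empty `osd dev map: {}` line.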
2024-08-28T16:37:30.028Z | <Casey Bodley> it's showing up in the [fs suite](https://pulpito.ceph.com/pdonnell-2024-08-28_16:26:41-fs-wip-pdonnell-testing-20240828.032152-debug-distro-default-smithi/) too, cc @Patrick Donnelly @Venky Shankar |
2024-08-28T16:47:08.816Z | <Patrick Donnelly> Yes, very suddenly |
2024-08-28T16:48:40.268Z | <Patrick Donnelly> cc @Zack Cerza |
2024-08-28T16:49:25.495Z | <Patrick Donnelly> I don't see a recent change to teuthology merged |
2024-08-28T17:07:19.518Z | <Dan Mick> Seems more like a change to the job yml would cause this. Not saying there was one |
2024-08-28T17:12:44.882Z | <Casey Bodley> started happening on quincy, squid and main at the same time, so unlikely due to changes in the suite branch |
2024-08-28T17:13:25.680Z | <Zack Cerza> got a link to one handy? |
2024-08-28T17:13:40.949Z | <Casey Bodley> <https://qa-proxy.ceph.com/teuthology/cbodley-2024-08-28_15:52:13-rgw-wip-67554-squid-distro-default-smithi/7878118/teuthology.log> |
2024-08-28T17:14:05.848Z | <Casey Bodley> > ```2024-08-28T16:00:55.653 INFO:teuthology.orchestra.run.smithi105.stderr:stat: cannot statx '/dev/vg_nvme/lv_1': No such file or directory``` |
2024-08-28T17:19:33.892Z | <Zack Cerza> <https://qa-proxy.ceph.com/teuthology/cbodley-2024-08-28_15:52:13-rgw-wip-67554-squid-distro-default-smithi/7878118/ansible.log>
ansible didn't run at all |
2024-08-28T17:33:51.150Z | <Casey Bodley> any idea where that `/etc/ansible/hosts/sepia` file comes from? |
2024-08-28T17:35:24.473Z | <Zack Cerza> yeah, ceph-sepia-secrets.git
the last time the file was changed was right around when sentry saw the first failure: <https://github.com/ceph/ceph-sepia-secrets/pull/904> |
2024-08-28T17:37:59.741Z | <Zack Cerza> <https://docs.ansible.com/ansible/latest/inventory_guide/intro_inventory.html#inventory-basics-formats-hosts-and-groups>: "Group names should follow the same guidelines as [Creating valid variable names](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_variables.html#valid-variable-names)."
<https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_variables.html#valid-variable-names>: "A variable name cannot begin with a number"
<https://github.com/ceph/ceph-sepia-secrets/pull/904/files#diff-6b0046333530400164979089341a41bd1f3459ad4f68d8e138e1c39f5fed13ccR734>: "2_jenkins_builders" |
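The naming rule quoted above can be checked mechanically. This is a rough sketch based only on the quoted rules (letters, digits and underscores, not starting with a digit) plus a Python keyword check; it does not cover Ansible playbook keywords, and `is_valid_group_name` is a hypothetical helper, not an Ansible API.

```python
import keyword
import re

# Letters, digits and underscores, not starting with a digit.
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_group_name(name):
    """Return True if `name` looks like a valid variable-style group name."""
    return _VALID_NAME.fullmatch(name) is not None and not keyword.iskeyword(name)
```

Under this check `2_jenkins_builders` is rejected because it begins with a number, while `jenkins_builders` passes.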
2024-08-28T17:48:29.612Z | <Zack Cerza> I self-merged this, which should resolve the issue: <https://github.com/ceph/ceph-sepia-secrets/pull/920> |
2024-08-28T17:50:01.263Z | <Casey Bodley> thanks, rescheduling to test |
2024-08-28T18:02:08.019Z | <Casey Bodley> @Zack Cerza still failing, ex [teuthology.log](https://qa-proxy.ceph.com/teuthology/cbodley-2024-08-28_17:49:24-rgw-wip-67554-squid-distro-default-smithi/7878301/teuthology.log) and [ansible.log](https://qa-proxy.ceph.com/teuthology/cbodley-2024-08-28_17:49:24-rgw-wip-67554-squid-distro-default-smithi/7878301/ansible.log) |
2024-08-28T18:04:17.354Z | <Casey Bodley> `jenkins_builders` is listed under `[jenkins_builders:children]` |
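A group that lists itself under its own `:children` section, as described above, can be spotted with a few lines of Python. This sketch handles only simple INI-style inventories and only direct self-reference, not longer cycles; `self_referencing_groups` is a hypothetical helper and `other_builders` in the sample is a made-up group name.

```python
def self_referencing_groups(inventory_text):
    """Return group names that list themselves under [name:children]."""
    offenders = []
    current = None  # group whose :children section we are inside, if any
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            header = line[1:-1]
            if header.endswith(":children"):
                current = header[: -len(":children")]
            else:
                current = None
            continue
        if current is not None and line.split()[0] == current:
            if current not in offenders:
                offenders.append(current)
    return offenders


# Hypothetical sample inventory; `other_builders` is invented for illustration.
sample = """\
[jenkins_builders:children]
jenkins_builders
other_builders
"""
print(self_referencing_groups(sample))  # prints ['jenkins_builders']
```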
2024-08-28T18:15:07.489Z | <Zack Cerza> off |
2024-08-28T18:15:12.565Z | <Zack Cerza> oof |
2024-08-28T18:15:23.389Z | <Zack Cerza> ok, I think I have a fix for that too |
2024-08-28T18:29:23.410Z | <Dan Mick> thanks @Zack Cerza |
2024-08-28T20:33:05.635Z | <Yonatan Zaken> Thanks, this worked for me. WSL users: make sure to relaunch WSL after editing wsl.conf as in the link above 🙂 |
2024-08-28T20:56:12.541Z | <Samuel Just> Probably need to install grpc-devel package -- I don't think the dependencies were updated |
2024-08-28T20:58:14.351Z | <Frank Filz> I have grpc-devel installed... It's been suggested to build without NVME so I've done that. |
2024-08-28T21:03:38.198Z | <Samuel Just> Hmm, ok |
2024-08-28T22:53:40.738Z | <Dan Mick> oh **scratch devices**, I remember those |