Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Support libbpf v1 compliant section names #228

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

donaldh
Copy link
Contributor

@donaldh donaldh commented Sep 18, 2022

Submitting this as an RFC to get feedback. If the approach looks sane then I will add tests and docs then resubmit.

Signed-off-by: Donald Hunter [email protected]

Copy link
Member

@tohojo tohojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach is OK, with some comments below. Also, the documentation needs to be updated to match, in particular the protocol description in protocol.org; and we should probably solicit comments from others implementing the protocol (Cc @dave-tucker)

echo "PRODUCTION:=${PRODUCTION}" >>$CONFIG
echo "DYNAMIC_LIBXDP:=${DYNAMIC_LIBXDP}" >>$CONFIG
echo "MAX_DISPATCHER_ACTIONS:=${MAX_DISPATCHER_ACTIONS}" >>$CONFIG
echo "BPF_TARGET:=${BPF_TARGET}" >>$CONFIG
echo "LIBBPF_V1_COMPLIANT:=${LIBBPF_V1_COMPLIANT}" >>$CONFIG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it makes sense to have a configure knob for this? See below, we can just set the define unconditionally...

#else
#define XDP_METADATA_SECTION XDP_METADATA_SECTION_ORIG
#endif

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the dispatcher we should really just define both sections until we can retire the old one; so no need for this ifdef, just define both and duplicate the line adding dispatcher_version in the dispatcher code.

Copy link
Contributor Author

@donaldh donaldh Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicating the dispatcher_version line results in:

xdp-dispatcher.c:201:52: error: section does not match previous declaration [-Werror,-Wsection]
__uint(dispatcher_version, XDP_DISPATCHER_VERSION) SEC(XDP_METADATA_SECTION_NEW)

So it would also require a different variable name to avoid the collision. That's another protocol change that I guess libxdp can handle, but has consequences for other implementations as well.

But I was thinking that the libxdp and dispatcher version on a system would match and would either be old or new, so only need the dispatcher_version declared in 1 section. It's legacy and new programs that we'd need to be able to read old or new metadata from. Or am I missing something?

@@ -5,7 +5,15 @@

#include <linux/types.h>

#define XDP_METADATA_SECTION "xdp_metadata"
#define XDP_METADATA_SECTION_ORIG "xdp_metadata"
#define XDP_METADATA_SECTION_NEW ".data.xdp_metadata"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use _DATA as suffix instead of _NEW?

#define XDP_RUN_CONFIG_SECTION_ORIG "xdp_run_config"
#define XDP_RUN_CONFIG_SECTION_NEW ".data.xdp_run_config"

#if defined(USE_LIBBPF_V1_SECTION_NAMES)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This define will be a user-facing API (as this header is meant to be included by user programs), so I guess we should document it somewhere. I'm not too fond of the name, but I can't think of anything better off the top of my head...

@@ -38,6 +38,10 @@ ifeq ($(SYSTEM_LIBBPF),y)
DEFINES += -DLIBBPF_DYNAMIC
endif

ifeq ($(LIBBPF_V1_COMPLIANT),y)
DEFINES += -DUSE_LIBBPF_V1_SECTION_NAMES
endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just set this unconditionally; this will only affect the BPF programs built as part of xdp-tools, and the assumption here is that those will only be used with the same version of the library (which will then understand the new section names straight away)...

@@ -5,7 +5,15 @@

#include <linux/types.h>

#define XDP_METADATA_SECTION "xdp_metadata"
#define XDP_METADATA_SECTION_ORIG "xdp_metadata"
#define XDP_METADATA_SECTION_NEW ".data.xdp_metadata"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely this doesn't need to be renamed since this is not a program section name - therefore libbpf should ignore it.
Otherwise this is a libbpf bug IMHO.

@donaldh
Copy link
Contributor Author

donaldh commented Sep 23, 2022 via email

@dave-tucker
Copy link

Toke has tried getting libbpf to fix this before:
Perhaps we should try again.

My intention (in Aya) is to iterate over the symbol table to discover programs (since one section can contain multiple progs, and it's the symbol name - C function name - that uniquely identifies a program anyway)... from there it's possible to get the section, and use the section name as a hint to what the program type should be.

Adopting that approach in libbpf would solve the problem since it wouldn't be assuming that any ELF section that matches a magic prefix like "xdp", "cgroup" etc... is a loadable program - which I guess is what is generating the warning here.

That, or perhaps any sections that aren't SHF_EXECINSTR should be ignored (if they aren't anyway) which would prevent this from triggering warnings. See below:

  [ 3] xdp/dispatcher    PROGBITS         0000000000000000  00000250
       00000000000004a0  0000000000000000  AX       0     0     8
  [ 7] xdp_metadata      PROGBITS         0000000000000000  00000748
       0000000000000008  0000000000000000  WA       0     0     8

@tohojo
Copy link
Member

tohojo commented Sep 27, 2022

Adopting that approach in libbpf would solve the problem since it wouldn't be assuming that any ELF section that matches a magic prefix like "xdp", "cgroup" etc... is a loadable program - which I guess is what is generating the warning here.

Looking at the discussion from back then, it seems Andrii would be OK with allowing section names that don't start with a dot, as long as they are marked as "non-allocatable": https://lore.kernel.org/r/CAEf4BzY9WxjBX65sa=8SJh4XGLGfHgxGKciRGiSUMJAxbQWWYg@mail.gmail.com

Sadly I have no idea how to convince the compiler to mark the sections as such, which is why the discussion stalled back then. If any of you have any ideas, maybe taking this route is better (i.e., getting rid of the warning in libbpf)?

@donaldh
Copy link
Contributor Author

donaldh commented Sep 28, 2022

Sadly I have no idea how to convince the compiler to mark the sections as such, which is why the discussion stalled back then. If any of you have any ideas, maybe taking this route is better (i.e., getting rid of the warning in libbpf)?

I have done some initial exploration and it can be done either by ld with a linker script, or later with objcopy. The libxdp build currently does this:

objcopy  --rename-section .data=.rodata,alloc,load,readonly,data,contents xsk_def_xdp_prog.embed.o

It is also possible to do:

objcopy --set-section-flags xdp_metadata=contents xsk_def_xdp_prog.o
$ objdump -h xsk_def_xdp_prog.new.o 

xsk_def_xdp_prog.new.o:     file format elf64-bpfle

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000000  0000000000000000  0000000000000000  00000040  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 xdp           00000030  0000000000000000  0000000000000000  00000040  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 .maps         00000020  0000000000000000  0000000000000000  00000070  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  3 license       00000004  0000000000000000  0000000000000000  00000090  2**0
                  CONTENTS, ALLOC, LOAD, DATA
  4 .xdp_run_config 00000010  0000000000000000  0000000000000000  00000098  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  5 xdp_metadata  00000008  0000000000000000  0000000000000000  000000a8  2**3
                  CONTENTS

@tohojo
Copy link
Member

tohojo commented Sep 29, 2022

Right, since there's no linking step for BPF object files, I guess it'll have to be objcopy. That still leaves .xdp_run_config, though, since it seemed from that discussion that libbpf still wanted to squat on everything starting with a dot.

I'm actually thinking we could just drop the section entirely? We don't really use the section itself for anything, we just need the DATASEC info to be written into the BTF. And I think that will survive even if we remove the section from the ELF file afterwards? May be worth trying out; we could do that for both the XSK programs and the dispatcher. A bit harder for user programs using the XDP_RUN_CONFIG macro, though; I don't think asking users to add obscure objdump invocations to their build environment is feasible.

Not sure if there's some macro magic that will make the compiler emit BTF without actually putting anything into the section itself in the object file?

@dave-tucker
Copy link

Not sure if there's some macro magic that will make the compiler emit BTF without actually putting anything into the section itself in the object file?

Since I've been looking at LLVM lately I figured that perhaps there would be some __attribute__ invocation that would do the right thing. From what I can see, there is not. Ultimately if there is a need for one, we should add it to LLVM such that we emit a SHT_PROGBITS ELF Section with no flags set, with a name of our choosing. Maybe something like: __attribute__(section("metadata", btf_only))

I'm actually thinking we could just drop the section entirely? We don't really use the section itself for anything, we just need the DATASEC info to be written into the BTF.

Thinking about this... do we actually need a DATASEC section?
Writing the above I did think we may have another alternative... that doesn't require an ELF section... and is macro friendly.

#define __uint(name, val) int (*name)[val]
struct {
	__uint(priority, 10);
	__uint(XDP_PASS, 1);
	__uint(XDP_DROP, 1);
} __attribute__((btf_decl_tag("xdp_metadata"))) xdp_metadata;

for which the BTF is:

[1] STRUCT '(anon)' size=24 vlen=3
        'priority' type_id=3 bits_offset=0
        'XDP_PASS' type_id=7 bits_offset=64
        'XDP_DROP' type_id=7 bits_offset=128
[2] DECL_TAG 'xdp_metadata' type_id=1 component_idx=-1
[3] PTR '(anon)' type_id=5
[4] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
[5] ARRAY '(anon)' type_id=4 index_type_id=6 nr_elems=10
[6] INT '__ARRAY_SIZE_TYPE__' size=4 bits_offset=0 nr_bits=32 encoding=(none)
[7] PTR '(anon)' type_id=8
[8] ARRAY '(anon)' type_id=4 index_type_id=6 nr_elems=1
[9] VAR 'xdp_metadata' type_id=1, linkage=global
[10] DATASEC '.bss' size=0 vlen=1
        type_id=9 offset=0 size=24 (VAR 'xdp_metadata')

(ok I lied, there is a .bss ELF section, but that's a problem for another day 🤣 )

libxdp can the lookup the DECL_TAG of xdp_metadata or xdp_run_config, find the base type and read k/v's appropriately.

@tohojo
Copy link
Member

tohojo commented Oct 4, 2022

libxdp can the lookup the DECL_TAG of xdp_metadata or xdp_run_config, find the base type and read k/v's appropriately.

Hmm, yeah, this does seem kinda reasonable, however...

(ok I lied, there is a .bss ELF section, but that's a problem for another day rofl )

...this we may have to deal with. Without putting stuff in a separate section, this variable will show up in the .bss section, which will at least be something that's user-visible when people use the skeleton generation to consume those vars. So in that sense it would be better if we could avoid having the actual variables in the data sections an just generate the BTF. This actually goes for the .data.XXX solution as well, I suppose...

Also, what version constraints (on libbpf and/or compiler) do we impose if we start using btf_decl_tag() ?

@dave-tucker
Copy link

libxdp can the lookup the DECL_TAG of xdp_metadata or xdp_run_config, find the base type and read k/v's appropriately.

Hmm, yeah, this does seem kinda reasonable, however...

(ok I lied, there is a .bss ELF section, but that's a problem for another day rofl )

...this we may have to deal with. Without putting stuff in a separate section, this variable will show up in the .bss section, which will at least be something that's user-visible when people use the skeleton generation to consume those vars. So in that sense it would be better if we could avoid having the actual variables in the data sections an just generate the BTF. This actually goes for the .data.XXX solution as well, I suppose...

Marking as const volatile places it in .rodata, which may be less bad?
But agreed it would be nice to hide it completely. Would be worth exploring further if/how we can do that.

Also, what version constraints (on libbpf and/or compiler) do we impose if we start using btf_decl_tag() ?

Clang 15, libbpf 0.6.0 - there are no kernel constraints as libbpf will sanitize BTF before uploading to the kernel.

@tohojo
Copy link
Member

tohojo commented Oct 5, 2022

Marking as const volatile places it in .rodata, which may be less bad? But agreed it would be nice to hide it completely. Would be worth exploring further if/how we can do that.

Yeah, would be cool if we could just add some BTF information without the variable...

Clang 15, libbpf 0.6.0 - there are no kernel constraints as libbpf will sanitize BTF before uploading to the kernel.

Hmm, clang 15 is not even a month old... may be a bit premature to incur a hard dependency on that? :)

@dave-tucker
Copy link

Hmm, clang 15 is not even a month old... may be a bit premature to incur a hard dependency on that? :)

I was wrong, It's in clang 14 too: https://releases.llvm.org/14.0.0/tools/clang/docs/AttributeReference.html#btf-decl-tag

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants