2024-04-01 07:32:27

by Andrea Righi

[permalink] [raw]
Subject: [PATCH v2 0/2] libbpf: API to partially consume items from ringbuffer

Introduce ring__consume_max() and ring_buffer__consume_max() API to
partially consume items from one (or more) ringbuffer(s).

This can be useful, for example, to consume just a single item or when
we need to copy multiple items to a limited user-space buffer from the
ringbuffer callback.

Practical example (where this API can be used):
https://github.com/sched-ext/scx/blob/b7c06b9ed9f72cad83c31e39e9c4e2cfd8683a55/rust/scx_rustland_core/src/bpf.rs#L217

See also:
https://lore.kernel.org/lkml/[email protected]/T/#u

v2:
- introduce a new API instead of changing the callback's retcode
behavior

Andrea Righi (2):
libbpf: ringbuf: allow to consume up to a certain amount of items
libbpf: Add ring__consume_max / ring_buffer__consume_max

tools/lib/bpf/libbpf.h | 13 +++++++++++++
tools/lib/bpf/libbpf.map | 2 ++
tools/lib/bpf/ringbuf.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++-------------


2024-04-01 07:32:49

by Andrea Righi

[permalink] [raw]
Subject: [PATCH 2/2] libbpf: Add ring__consume_max / ring_buffer__consume_max

Introduce a new API to consume items from a ring buffer, limited to a
specified amount, and return to the caller the actual number of items
consumed.

Link: https://lore.kernel.org/lkml/[email protected]/T
Signed-off-by: Andrea Righi <[email protected]>
---
tools/lib/bpf/libbpf.h | 13 +++++++++++++
tools/lib/bpf/libbpf.map | 2 ++
tools/lib/bpf/ringbuf.c | 35 +++++++++++++++++++++++++++++++++--
3 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index f88ab50c0229..85161889efd4 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -1293,6 +1293,8 @@ LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx);
LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
+LIBBPF_API int ring_buffer__consume_max(struct ring_buffer *rb,
+ size_t max_items);
LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);

/**
@@ -1367,6 +1369,17 @@ LIBBPF_API int ring__map_fd(const struct ring *r);
*/
LIBBPF_API int ring__consume(struct ring *r);

+/**
+ * @brief **ring__consume_max()** consumes up to a certain amount of items from
+ * a ringbuffer without event polling.
+ *
+ * @param r A ringbuffer object.
+ * @param max_items Maximum amount of items to consume.
+ * @return The number of items consumed (or max_items, whichever is less), or a
+ * negative number if any of the callbacks return an error.
+ */
+LIBBPF_API int ring__consume_max(struct ring *r, size_t max_items);
+
struct user_ring_buffer_opts {
size_t sz; /* size of this struct, for forward/backward compatibility */
};
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 51732ecb1385..bb3ed905119d 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -415,4 +415,6 @@ LIBBPF_1.4.0 {
bpf_token_create;
btf__new_split;
btf_ext__raw_data;
+ ring__consume_max;
+ ring_buffer__consume_max;
} LIBBPF_1.3.0;
diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index 81df535040d1..c8123e326b1a 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -281,6 +281,32 @@ static int64_t ringbuf_process_ring(struct ring *r, int64_t max_items)
return cnt;
}

+/* Consume available ring buffer(s) data without event polling up to max_items.
+ *
+ * Returns number of records consumed across all registered ring buffers (or
+ * max_items, whichever is less), or negative number if any of the callbacks
+ * return error.
+ */
+int ring_buffer__consume_max(struct ring_buffer *rb, size_t max_items)
+{
+ int64_t err, res = 0;
+ int i;
+
+ for (i = 0; i < rb->ring_cnt; i++) {
+ struct ring *ring = rb->rings[i];
+
+ err = ringbuf_process_ring(ring, max_items);
+ if (err < 0)
+ return libbpf_err(err);
+ res += err;
+ max_items -= err;
+
+ if (!max_items)
+ break;
+ }
+ return res;
+}
+
/* Consume available ring buffer(s) data without event polling.
* Returns number of records consumed across all registered ring buffers (or
* INT_MAX, whichever is less), or negative number if any of the callbacks
@@ -376,17 +402,22 @@ int ring__map_fd(const struct ring *r)
return r->map_fd;
}

-int ring__consume(struct ring *r)
+int ring__consume_max(struct ring *r, size_t max_items)
{
int64_t res;

- res = ringbuf_process_ring(r, INT_MAX);
+ res = ringbuf_process_ring(r, max_items);
if (res < 0)
return libbpf_err(res);

return res;
}

+int ring__consume(struct ring *r)
+{
+ return ring__consume_max(r, INT_MAX);
+}
+
static void user_ringbuf_unmap_ring(struct user_ring_buffer *rb)
{
if (rb->consumer_pos) {
--
2.43.0


2024-04-01 07:32:50

by Andrea Righi

[permalink] [raw]
Subject: [PATCH 1/2] libbpf: ringbuf: allow to consume up to a certain amount of items

In some cases, instead of always consuming all items from ring buffers
in a greedy way, we may want to consume up to a certain amount of items,
for example when we need to copy items from the BPF ring buffer to a
limited user buffer.

This change allows to set an upper limit to the amount of items consumed
from one or more ring buffers.

Link: https://lore.kernel.org/lkml/[email protected]/T
Signed-off-by: Andrea Righi <[email protected]>
---
tools/lib/bpf/ringbuf.c | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
index aacb64278a01..81df535040d1 100644
--- a/tools/lib/bpf/ringbuf.c
+++ b/tools/lib/bpf/ringbuf.c
@@ -231,7 +231,7 @@ static inline int roundup_len(__u32 len)
return (len + 7) / 8 * 8;
}

-static int64_t ringbuf_process_ring(struct ring *r)
+static int64_t ringbuf_process_ring(struct ring *r, int64_t max_items)
{
int *len_ptr, len, err;
/* 64-bit to avoid overflow in case of extreme application behavior */
@@ -264,7 +264,14 @@ static int64_t ringbuf_process_ring(struct ring *r)
cons_pos);
return err;
}
- cnt++;
+ if (++cnt >= max_items) {
+ /* update consumer pos and return the
+ * total amount of items consumed.
+ */
+ smp_store_release(r->consumer_pos,
+ cons_pos);
+ goto done;
+ }
}

smp_store_release(r->consumer_pos, cons_pos);
@@ -281,19 +288,18 @@ static int64_t ringbuf_process_ring(struct ring *r)
*/
int ring_buffer__consume(struct ring_buffer *rb)
{
- int64_t err, res = 0;
+ int64_t err, res = 0, max_items = INT_MAX;
int i;

for (i = 0; i < rb->ring_cnt; i++) {
struct ring *ring = rb->rings[i];

- err = ringbuf_process_ring(ring);
+ err = ringbuf_process_ring(ring, max_items);
if (err < 0)
return libbpf_err(err);
res += err;
+ max_items -= err;
}
- if (res > INT_MAX)
- return INT_MAX;
return res;
}

@@ -304,7 +310,7 @@ int ring_buffer__consume(struct ring_buffer *rb)
int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
{
int i, cnt;
- int64_t err, res = 0;
+ int64_t err, res = 0, max_items = INT_MAX;

cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms);
if (cnt < 0)
@@ -314,13 +320,12 @@ int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
__u32 ring_id = rb->events[i].data.fd;
struct ring *ring = rb->rings[ring_id];

- err = ringbuf_process_ring(ring);
+ err = ringbuf_process_ring(ring, max_items);
if (err < 0)
return libbpf_err(err);
res += err;
+ max_items -= err;
}
- if (res > INT_MAX)
- return INT_MAX;
return res;
}

@@ -375,11 +380,11 @@ int ring__consume(struct ring *r)
{
int64_t res;

- res = ringbuf_process_ring(r);
+ res = ringbuf_process_ring(r, INT_MAX);
if (res < 0)
return libbpf_err(res);

- return res > INT_MAX ? INT_MAX : res;
+ return res;
}

static void user_ringbuf_unmap_ring(struct user_ring_buffer *rb)
--
2.43.0


2024-04-02 17:58:59

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH 1/2] libbpf: ringbuf: allow to consume up to a certain amount of items

On Mon, Apr 1, 2024 at 12:32 AM Andrea Righi <[email protected]> wrote:
>
> In some cases, instead of always consuming all items from ring buffers
> in a greedy way, we may want to consume up to a certain amount of items,
> for example when we need to copy items from the BPF ring buffer to a
> limited user buffer.
>
> This change allows to set an upper limit to the amount of items consumed
> from one or more ring buffers.
>
> Link: https://lore.kernel.org/lkml/[email protected]/T
> Signed-off-by: Andrea Righi <[email protected]>
> ---
> tools/lib/bpf/ringbuf.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index aacb64278a01..81df535040d1 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -231,7 +231,7 @@ static inline int roundup_len(__u32 len)
> return (len + 7) / 8 * 8;
> }
>
> -static int64_t ringbuf_process_ring(struct ring *r)
> +static int64_t ringbuf_process_ring(struct ring *r, int64_t max_items)
> {
> int *len_ptr, len, err;
> /* 64-bit to avoid overflow in case of extreme application behavior */
> @@ -264,7 +264,14 @@ static int64_t ringbuf_process_ring(struct ring *r)
> cons_pos);
> return err;
> }
> - cnt++;
> + if (++cnt >= max_items) {
> + /* update consumer pos and return the
> + * total amount of items consumed.
> + */
> + smp_store_release(r->consumer_pos,
> + cons_pos);

Does this fit on a single line under 100 characters? If yes, please
keep it as a single line

but actually it seems cleaner to keep cnt++ intact, let
smp_store_release() below happen, and then check the exit condition.
Were you afraid to do unnecessary checks on discarded samples? I
wouldn't worry about that.

> + goto done;
> + }
> }
>
> smp_store_release(r->consumer_pos, cons_pos);
> @@ -281,19 +288,18 @@ static int64_t ringbuf_process_ring(struct ring *r)
> */
> int ring_buffer__consume(struct ring_buffer *rb)
> {
> - int64_t err, res = 0;
> + int64_t err, res = 0, max_items = INT_MAX;

I'm wondering if it might be better to have a convention that zero
means "no limit", which might allow the compiler to eliminate the
condition in ringbuf_process_ring altogether due to constant
propagation. WDYT?

> int i;
>
> for (i = 0; i < rb->ring_cnt; i++) {
> struct ring *ring = rb->rings[i];
>
> - err = ringbuf_process_ring(ring);
> + err = ringbuf_process_ring(ring, max_items);
> if (err < 0)
> return libbpf_err(err);
> res += err;
> + max_items -= err;
> }
> - if (res > INT_MAX)
> - return INT_MAX;
> return res;
> }

[...]

2024-04-02 18:05:04

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH 2/2] libbpf: Add ring__consume_max / ring_buffer__consume_max

On Mon, Apr 1, 2024 at 12:32 AM Andrea Righi <[email protected]> wrote:
>
> Introduce a new API to consume items from a ring buffer, limited to a
> specified amount, and return to the caller the actual number of items
> consumed.
>
> Link: https://lore.kernel.org/lkml/[email protected]/T
> Signed-off-by: Andrea Righi <[email protected]>
> ---
> tools/lib/bpf/libbpf.h | 13 +++++++++++++
> tools/lib/bpf/libbpf.map | 2 ++
> tools/lib/bpf/ringbuf.c | 35 +++++++++++++++++++++++++++++++++--
> 3 files changed, 48 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index f88ab50c0229..85161889efd4 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -1293,6 +1293,8 @@ LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
> ring_buffer_sample_fn sample_cb, void *ctx);
> LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
> LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
> +LIBBPF_API int ring_buffer__consume_max(struct ring_buffer *rb,
> + size_t max_items);
> LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);
>
> /**
> @@ -1367,6 +1369,17 @@ LIBBPF_API int ring__map_fd(const struct ring *r);
> */
> LIBBPF_API int ring__consume(struct ring *r);
>
> +/**
> + * @brief **ring__consume_max()** consumes up to a certain amount of items from

nit: certain -> requested ?

> + * a ringbuffer without event polling.
> + *
> + * @param r A ringbuffer object.
> + * @param max_items Maximum amount of items to consume.
> + * @return The number of items consumed (or max_items, whichever is less), or a

if we reach max_items, we did consume max_items, so I think "number of
items consumed" is right and I'd drop the part in parenthesis

> + * negative number if any of the callbacks return an error.
> + */
> +LIBBPF_API int ring__consume_max(struct ring *r, size_t max_items);

I'm bikeshedding here, of course, but I prefer `ring__consume_n` and
max_items -> n.

> +
> struct user_ring_buffer_opts {
> size_t sz; /* size of this struct, for forward/backward compatibility */
> };
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index 51732ecb1385..bb3ed905119d 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -415,4 +415,6 @@ LIBBPF_1.4.0 {
> bpf_token_create;
> btf__new_split;
> btf_ext__raw_data;
> + ring__consume_max;
> + ring_buffer__consume_max;

we are in 1.5 cycle now, please add another section that inherits from
LIBBPF_1.4.0

> } LIBBPF_1.3.0;
> diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> index 81df535040d1..c8123e326b1a 100644
> --- a/tools/lib/bpf/ringbuf.c
> +++ b/tools/lib/bpf/ringbuf.c
> @@ -281,6 +281,32 @@ static int64_t ringbuf_process_ring(struct ring *r, int64_t max_items)
> return cnt;
> }
>
> +/* Consume available ring buffer(s) data without event polling up to max_items.
> + *
> + * Returns number of records consumed across all registered ring buffers (or
> + * max_items, whichever is less), or negative number if any of the callbacks
> + * return error.
> + */
> +int ring_buffer__consume_max(struct ring_buffer *rb, size_t max_items)
> +{
> + int64_t err, res = 0;
> + int i;
> +
> + for (i = 0; i < rb->ring_cnt; i++) {
> + struct ring *ring = rb->rings[i];
> +
> + err = ringbuf_process_ring(ring, max_items);
> + if (err < 0)
> + return libbpf_err(err);
> + res += err;
> + max_items -= err;
> +
> + if (!max_items)

please use `== 0`, it's not a bool and not a pointer

> + break;
> + }
> + return res;
> +}
> +
> /* Consume available ring buffer(s) data without event polling.
> * Returns number of records consumed across all registered ring buffers (or
> * INT_MAX, whichever is less), or negative number if any of the callbacks
> @@ -376,17 +402,22 @@ int ring__map_fd(const struct ring *r)
> return r->map_fd;
> }
>
> -int ring__consume(struct ring *r)
> +int ring__consume_max(struct ring *r, size_t max_items)
> {
> int64_t res;

hm.. we can probably change this to int, can you do that as part of
your changes?

>
> - res = ringbuf_process_ring(r, INT_MAX);
> + res = ringbuf_process_ring(r, max_items);
> if (res < 0)
> return libbpf_err(res);
>
> return res;
> }
>
> +int ring__consume(struct ring *r)
> +{
> + return ring__consume_max(r, INT_MAX);
> +}
> +
> static void user_ringbuf_unmap_ring(struct user_ring_buffer *rb)
> {
> if (rb->consumer_pos) {
> --
> 2.43.0
>

2024-04-02 20:38:40

by Andrea Righi

[permalink] [raw]
Subject: Re: [PATCH 1/2] libbpf: ringbuf: allow to consume up to a certain amount of items

On Tue, Apr 02, 2024 at 10:58:33AM -0700, Andrii Nakryiko wrote:
> On Mon, Apr 1, 2024 at 12:32 AM Andrea Righi <[email protected]> wrote:
> >
> > In some cases, instead of always consuming all items from ring buffers
> > in a greedy way, we may want to consume up to a certain amount of items,
> > for example when we need to copy items from the BPF ring buffer to a
> > limited user buffer.
> >
> > This change allows to set an upper limit to the amount of items consumed
> > from one or more ring buffers.
> >
> > Link: https://lore.kernel.org/lkml/[email protected]/T
> > Signed-off-by: Andrea Righi <[email protected]>
> > ---
> > tools/lib/bpf/ringbuf.c | 29 +++++++++++++++++------------
> > 1 file changed, 17 insertions(+), 12 deletions(-)
> >
> > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > index aacb64278a01..81df535040d1 100644
> > --- a/tools/lib/bpf/ringbuf.c
> > +++ b/tools/lib/bpf/ringbuf.c
> > @@ -231,7 +231,7 @@ static inline int roundup_len(__u32 len)
> > return (len + 7) / 8 * 8;
> > }
> >
> > -static int64_t ringbuf_process_ring(struct ring *r)
> > +static int64_t ringbuf_process_ring(struct ring *r, int64_t max_items)
> > {
> > int *len_ptr, len, err;
> > /* 64-bit to avoid overflow in case of extreme application behavior */
> > @@ -264,7 +264,14 @@ static int64_t ringbuf_process_ring(struct ring *r)
> > cons_pos);
> > return err;
> > }
> > - cnt++;
> > + if (++cnt >= max_items) {
> > + /* update consumer pos and return the
> > + * total amount of items consumed.
> > + */
> > + smp_store_release(r->consumer_pos,
> > + cons_pos);
>
> Does this fit on a single line under 100 characters? If yes, please
> keep it as a single line
>
> but actually it seems cleaner to keep cnt++ intact, let
> smp_store_release() below happen, and then check the exit condition.
> Were you afraid to do unnecessary checks on discarded samples? I
> wouldn't worry about that.

Ok, it makes sense, I'll change it.

>
> > + goto done;
> > + }
> > }
> >
> > smp_store_release(r->consumer_pos, cons_pos);
> > @@ -281,19 +288,18 @@ static int64_t ringbuf_process_ring(struct ring *r)
> > */
> > int ring_buffer__consume(struct ring_buffer *rb)
> > {
> > - int64_t err, res = 0;
> > + int64_t err, res = 0, max_items = INT_MAX;
>
> I'm wondering if it might be better to have a convention that zero
> means "no limit", which might allow the compiler to eliminate the
> condition in ringbuf_process_ring altogether due to constant
> propagation. WDYT?

Indeed, in this way we won't add any potential overhead to the existing
code that doesn't care about max_items. Will add that.

-Andrea

2024-04-02 20:41:24

by Andrea Righi

[permalink] [raw]
Subject: Re: [PATCH 2/2] libbpf: Add ring__consume_max / ring_buffer__consume_max

On Tue, Apr 02, 2024 at 11:04:39AM -0700, Andrii Nakryiko wrote:
..
> > + * negative number if any of the callbacks return an error.
> > + */
> > +LIBBPF_API int ring__consume_max(struct ring *r, size_t max_items);
>
> I'm bikeshedding here, of course, but I prefer `ring__consume_n` and
> max_items -> n.

I actually like "_n" more than "_max" (same with max_items -> n).

I'll change this (with all the other suggestions) and will send a new
version.

Thanks for the review!
-Andrea