For binaries that are statically linked, consecutive stack frames are
likely to be in the same VMA and therefore have the same build id.
As an optimization for this case, we can cache the previous frame's
VMA, if the new frame has the same VMA as the previous one, reuse the
previous one's build id. We are holding the MM locks as reader across
the entire loop, so we don't need to worry about VMA going away.
Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
test_progs.
Suggested-by: Greg Thelen <[email protected]>
Signed-off-by: Hao Luo <[email protected]>
---
kernel/bpf/stackmap.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 22c8ae94e4c1..38bdfcd06f55 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -132,7 +132,8 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
int i;
struct mmap_unlock_irq_work *work = NULL;
bool irq_work_busy = bpf_mmap_unlock_get_irq_work(&work);
- struct vm_area_struct *vma;
+ struct vm_area_struct *vma, *prev_vma = NULL;
+ const char *prev_build_id;
/* If the irq_work is in use, fall back to report ips. Same
* fallback is used for kernel stack (!user) on a stackmap with
@@ -150,6 +151,12 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
}
for (i = 0; i < trace_nr; i++) {
+ if (range_in_vma(prev_vma, ips[i], ips[i])) {
+ vma = prev_vma;
+ memcpy(id_offs[i].build_id, prev_build_id,
+ BUILD_ID_SIZE_MAX);
+ goto build_id_valid;
+ }
vma = find_vma(current->mm, ips[i]);
if (!vma || build_id_parse(vma, id_offs[i].build_id, NULL)) {
/* per entry fall back to ips */
@@ -158,9 +165,12 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
continue;
}
+build_id_valid:
id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
- vma->vm_start;
id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
+ prev_vma = vma;
+ prev_build_id = id_offs[i].build_id;
}
bpf_mmap_unlock_mm(work, current->mm);
}
--
2.35.1.473.g83b2b277ed-goog
On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <[email protected]> wrote:
>
> For binaries that are statically linked, consecutive stack frames are
> likely to be in the same VMA and therefore have the same build id.
> As an optimization for this case, we can cache the previous frame's
> VMA, if the new frame has the same VMA as the previous one, reuse the
> previous one's build id. We are holding the MM locks as reader across
> the entire loop, so we don't need to worry about VMA going away.
>
> Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> test_progs.
>
> Suggested-by: Greg Thelen <[email protected]>
> Signed-off-by: Hao Luo <[email protected]>
> ---
LGTM. Can you share performance numbers before and after?
Acked-by: Andrii Nakryiko <[email protected]>
> kernel/bpf/stackmap.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 22c8ae94e4c1..38bdfcd06f55 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -132,7 +132,8 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
> int i;
> struct mmap_unlock_irq_work *work = NULL;
> bool irq_work_busy = bpf_mmap_unlock_get_irq_work(&work);
> - struct vm_area_struct *vma;
> + struct vm_area_struct *vma, *prev_vma = NULL;
> + const char *prev_build_id;
>
> /* If the irq_work is in use, fall back to report ips. Same
> * fallback is used for kernel stack (!user) on a stackmap with
> @@ -150,6 +151,12 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
> }
>
> for (i = 0; i < trace_nr; i++) {
> + if (range_in_vma(prev_vma, ips[i], ips[i])) {
> + vma = prev_vma;
> + memcpy(id_offs[i].build_id, prev_build_id,
> + BUILD_ID_SIZE_MAX);
> + goto build_id_valid;
> + }
> vma = find_vma(current->mm, ips[i]);
> if (!vma || build_id_parse(vma, id_offs[i].build_id, NULL)) {
> /* per entry fall back to ips */
> @@ -158,9 +165,12 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
> memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
> continue;
> }
> +build_id_valid:
> id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
> - vma->vm_start;
> id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
> + prev_vma = vma;
> + prev_build_id = id_offs[i].build_id;
> }
> bpf_mmap_unlock_mm(work, current->mm);
> }
> --
> 2.35.1.473.g83b2b277ed-goog
>
On Wed, Feb 23, 2022 at 4:11 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <[email protected]> wrote:
> >
> > For binaries that are statically linked, consecutive stack frames are
> > likely to be in the same VMA and therefore have the same build id.
> > As an optimization for this case, we can cache the previous frame's
> > VMA, if the new frame has the same VMA as the previous one, reuse the
> > previous one's build id. We are holding the MM locks as reader across
> > the entire loop, so we don't need to worry about VMA going away.
> >
> > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > test_progs.
> >
> > Suggested-by: Greg Thelen <[email protected]>
> > Signed-off-by: Hao Luo <[email protected]>
> > ---
>
> LGTM. Can you share performance numbers before and after?
>
> Acked-by: Andrii Nakryiko <[email protected]>
>
Thanks Andrii.
On a real-world workload, we observed that 66% of cpu cycles in
__bpf_get_stackid() were spent on build_id_parse() and find_vma().
This was before.
We haven't evaluated the performance with this patch yet. This
optimization seems straightforward, so we plan to upstream it first
and then retest.
On Wed, Feb 23, 2022 at 4:33 PM Hao Luo <[email protected]> wrote:
>
> On Wed, Feb 23, 2022 at 4:11 PM Andrii Nakryiko
> <[email protected]> wrote:
> >
> > On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <[email protected]> wrote:
> > >
> > > For binaries that are statically linked, consecutive stack frames are
> > > likely to be in the same VMA and therefore have the same build id.
> > > As an optimization for this case, we can cache the previous frame's
> > > VMA, if the new frame has the same VMA as the previous one, reuse the
> > > previous one's build id. We are holding the MM locks as reader across
> > > the entire loop, so we don't need to worry about VMA going away.
> > >
> > > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > > test_progs.
> > >
> > > Suggested-by: Greg Thelen <[email protected]>
> > > Signed-off-by: Hao Luo <[email protected]>
> > > ---
> >
> > LGTM. Can you share performance numbers before and after?
> >
> > Acked-by: Andrii Nakryiko <[email protected]>
> >
>
> Thanks Andrii.
>
> On a real-world workload, we observed that 66% of cpu cycles in
> __bpf_get_stackid() were spent on build_id_parse() and find_vma().
> This was before.
>
> We haven't evaluated the performance with this patch yet. This
> optimization seems straightforward, so we plan to upstream it first
> and then retest.
Ok, once it lands upstream, I'd really appreciate if you can retest
and update us with numbers. Thanks!
On Wed, Feb 23, 2022 at 5:10 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Wed, Feb 23, 2022 at 4:33 PM Hao Luo <[email protected]> wrote:
> >
> > On Wed, Feb 23, 2022 at 4:11 PM Andrii Nakryiko
> > <[email protected]> wrote:
> > >
> > > On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <[email protected]> wrote:
> > > >
> > > > For binaries that are statically linked, consecutive stack frames are
> > > > likely to be in the same VMA and therefore have the same build id.
> > > > As an optimization for this case, we can cache the previous frame's
> > > > VMA, if the new frame has the same VMA as the previous one, reuse the
> > > > previous one's build id. We are holding the MM locks as reader across
> > > > the entire loop, so we don't need to worry about VMA going away.
> > > >
> > > > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > > > test_progs.
> > > >
> > > > Suggested-by: Greg Thelen <[email protected]>
> > > > Signed-off-by: Hao Luo <[email protected]>
> > > > ---
> > >
> > > LGTM. Can you share performance numbers before and after?
> > >
> > > Acked-by: Andrii Nakryiko <[email protected]>
> > >
> >
> > Thanks Andrii.
> >
> > On a real-world workload, we observed that 66% of cpu cycles in
> > __bpf_get_stackid() were spent on build_id_parse() and find_vma().
> > This was before.
> >
> > We haven't evaluated the performance with this patch yet. This
> > optimization seems straightforward, so we plan to upstream it first
> > and then retest.
>
> Ok, once it lands upstream, I'd really appreciate if you can retest
> and update us with numbers. Thanks!
Sure, will do that.
On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <[email protected]> wrote:
>
> For binaries that are statically linked, consecutive stack frames are
> likely to be in the same VMA and therefore have the same build id.
> As an optimization for this case, we can cache the previous frame's
> VMA, if the new frame has the same VMA as the previous one, reuse the
> previous one's build id. We are holding the MM locks as reader across
> the entire loop, so we don't need to worry about VMA going away.
>
> Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> test_progs.
>
> Suggested-by: Greg Thelen <[email protected]>
> Signed-off-by: Hao Luo <[email protected]>
Nice optimization!
Acked-by: Song Liu <[email protected]>
Hello,
On Wed, Feb 23, 2022 at 6:31 PM Song Liu <[email protected]> wrote:
>
> On Wed, Feb 23, 2022 at 4:05 PM Hao Luo <[email protected]> wrote:
> >
> > For binaries that are statically linked, consecutive stack frames are
> > likely to be in the same VMA and therefore have the same build id.
> > As an optimization for this case, we can cache the previous frame's
> > VMA, if the new frame has the same VMA as the previous one, reuse the
> > previous one's build id. We are holding the MM locks as reader across
> > the entire loop, so we don't need to worry about VMA going away.
> >
> > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > test_progs.
> >
> > Suggested-by: Greg Thelen <[email protected]>
> > Signed-off-by: Hao Luo <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Thanks,
Namhyung
On 2/23/22 19:05, Hao Luo wrote:
> For binaries that are statically linked, consecutive stack frames are
> likely to be in the same VMA and therefore have the same build id.
> As an optimization for this case, we can cache the previous frame's
> VMA, if the new frame has the same VMA as the previous one, reuse the
> previous one's build id. We are holding the MM locks as reader across
> the entire loop, so we don't need to worry about VMA going away.
>
> Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> test_progs.
>
> Suggested-by: Greg Thelen <[email protected]>
> Signed-off-by: Hao Luo <[email protected]>
Reviewed-by: Pasha Tatashin <[email protected]>
Thanks,
Pasha
Hello:
This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <[email protected]>:
On Wed, 23 Feb 2022 16:05:31 -0800 you wrote:
> For binaries that are statically linked, consecutive stack frames are
> likely to be in the same VMA and therefore have the same build id.
> As an optimization for this case, we can cache the previous frame's
> VMA, if the new frame has the same VMA as the previous one, reuse the
> previous one's build id. We are holding the MM locks as reader across
> the entire loop, so we don't need to worry about VMA going away.
>
> [...]
Here is the summary with links:
- [bpf-next,v2] bpf: Cache the last valid build_id.
https://git.kernel.org/bpf/bpf-next/c/ceac059ed4fd
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
On Fri, Feb 25, 2022 at 12:43 PM Pasha Tatashin
<[email protected]> wrote:
>
> On 2/23/22 19:05, Hao Luo wrote:
> > For binaries that are statically linked, consecutive stack frames are
> > likely to be in the same VMA and therefore have the same build id.
> > As an optimization for this case, we can cache the previous frame's
> > VMA, if the new frame has the same VMA as the previous one, reuse the
> > previous one's build id. We are holding the MM locks as reader across
> > the entire loop, so we don't need to worry about VMA going away.
> >
> > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > test_progs.
> >
> > Suggested-by: Greg Thelen <[email protected]>
> > Signed-off-by: Hao Luo <[email protected]>
>
> Reviewed-by: Pasha Tatashin <[email protected]>
>
An update with performance numbers. Thanks to Blake Jones for
collecting the stats:
In a production workload, with BPF probes sampling stack trace, we see
the following changes:
- stack_map_get_build_id_offset() is taking 70% of the time of
__bpf_get_stackid(); it was 80% before.
- find_get_page() and find_vma() together are taking 75% of the time
of stack_map_get_build_id_offset(); it was 83% before.
Note the call chain is
__bpf_get_stackid()
-> stack_map_get_build_id_offset()
-> find_get_page()
-> find_vma()
> Thanks,
> Pasha
On Thu, Mar 10, 2022 at 5:16 PM Hao Luo <[email protected]> wrote:
>
> On Fri, Feb 25, 2022 at 12:43 PM Pasha Tatashin
> <[email protected]> wrote:
> >
> > On 2/23/22 19:05, Hao Luo wrote:
> > > For binaries that are statically linked, consecutive stack frames are
> > > likely to be in the same VMA and therefore have the same build id.
> > > As an optimization for this case, we can cache the previous frame's
> > > VMA, if the new frame has the same VMA as the previous one, reuse the
> > > previous one's build id. We are holding the MM locks as reader across
> > > the entire loop, so we don't need to worry about VMA going away.
> > >
> > > Tested through "stacktrace_build_id" and "stacktrace_build_id_nmi" in
> > > test_progs.
> > >
> > > Suggested-by: Greg Thelen <[email protected]>
> > > Signed-off-by: Hao Luo <[email protected]>
> >
> > Reviewed-by: Pasha Tatashin <[email protected]>
> >
>
> An update with performance numbers. Thanks to Blake Jones for
> collecting the stats:
>
> In a production workload, with BPF probes sampling stack trace, we see
> the following changes:
>
> - stack_map_get_build_id_offset() is taking 70% of the time of
> __bpf_get_stackid(); it was 80% before.
Great, thanks for following up with updated numbers!
>
> - find_get_page() and find_vma() together are taking 75% of the time
> of stack_map_get_build_id_offset(); it was 83% before.
>
> Note the call chain is
>
> __bpf_get_stackid()
> -> stack_map_get_build_id_offset()
> -> find_get_page()
> -> find_vma()
>
> > Thanks,
> > Pasha