2021-06-02 23:19:55

by Riccardo Mancini

[permalink] [raw]
Subject: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

ASan reported a memory leak of BPF-related ksymbols map and dso.
The leak is caused by refcount never reaching 0, due to missing
__put calls in the function machine__process_ksymbol_register.
Once the dso is inserted in map, dso__put should be called,
since map__new2 has increased its refcount to 2.
The same thing applies for the map when it's inserted into the
rb-tree in maps (maps__insert increases the refcount to 2).

$ sudo ./perf record -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

=================================================================
==297735==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 6992 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
#2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
[...]

Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
#2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
#3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
[...]

Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
#2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
[...]

Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
#2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
[...]

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/machine.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 3ff4936a15a42..d5937778875e1 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -776,6 +776,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
if (dso) {
dso->kernel = DSO_SPACE__KERNEL;
map = map__new2(0, dso);
+ dso__put(dso);
}

if (!dso || !map) {
@@ -792,6 +793,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
map->start = event->ksymbol.addr;
map->end = map->start + event->ksymbol.len;
maps__insert(&machine->kmaps, map);
+ map__put(map);
dso__set_loaded(dso);

if (is_bpf_image(event->ksymbol.name)) {
--
2.31.1


2021-06-04 04:30:12

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

On Wed, Jun 2, 2021 at 4:15 PM Riccardo Mancini <[email protected]> wrote:
>
> ASan reported a memory leak of BPF-related ksymbols map and dso.
> The leak is caused by refcount never reaching 0, due to missing
> __put calls in the function machine__process_ksymbol_register.
> Once the dso is inserted in map, dso__put should be called,
> since map__new2 has increased its refcount to 2.
> The same thing applies for the map when it's inserted into the
> rb-tree in maps (maps__insert increases the refcount to 2).
>
> $ sudo ./perf record -- sleep 5
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]
>
> =================================================================
> ==297735==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 6992 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
> #2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
> [...]
>
> Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
> #2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
> #3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
> [...]
>
> Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
> #2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
> [...]
>
> Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
> #2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
> [...]
>
> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/machine.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 3ff4936a15a42..d5937778875e1 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -776,6 +776,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
> if (dso) {
> dso->kernel = DSO_SPACE__KERNEL;
> map = map__new2(0, dso);
> + dso__put(dso);

Will this cause 2 puts if the map allocation fails? Perhaps this
should be "if (map) dso__put(dso);".

Thanks,
Ian

> }
>
> if (!dso || !map) {
> @@ -792,6 +793,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
> map->start = event->ksymbol.addr;
> map->end = map->start + event->ksymbol.len;
> maps__insert(&machine->kmaps, map);
> + map__put(map);
> dso__set_loaded(dso);
>
> if (is_bpf_image(event->ksymbol.name)) {
> --
> 2.31.1
>

2021-06-04 13:26:36

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

Em Thu, Jun 03, 2021 at 09:26:40PM -0700, Ian Rogers escreveu:
> On Wed, Jun 2, 2021 at 4:15 PM Riccardo Mancini <[email protected]> wrote:
> > +++ b/tools/perf/util/machine.c
> > @@ -776,6 +776,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
> > if (dso) {
> > dso->kernel = DSO_SPACE__KERNEL;
> > map = map__new2(0, dso);
> > + dso__put(dso);

> Will this cause 2 puts if the map allocation fails? Perhaps this
> should be "if (map) dso__put(dso);".

I think its just a matter of removing the put in the error path, i.e.
the patch becomes what is at the end of this message.

I.e. if map__new2() fails, we want to drop the dso reference, and if it
works, we already have a reference to it, obtained in map__new2().

But looking at this code now I realize that maps__find() should grab a
refcount for the map it returns, because in this
machine__process_ksymbol_register() function we use reference that 'map'
after the if block, i.e. we use it if it came from maps__find() or if we
created it machine__process_ksymbol_register, so there is a possible
race where other thread removes it from the list and map__put()s it
ending up in map__delete() while we still use it in
machine__process_ksymbol_register(), right?

- Arnaldo

> > }

> > if (!dso || !map) {
> > @@ -792,6 +793,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
> > map->start = event->ksymbol.addr;
> > map->end = map->start + event->ksymbol.len;
> > maps__insert(&machine->kmaps, map);
> > + map__put(map);
> > dso__set_loaded(dso);

> > if (is_bpf_image(event->ksymbol.name)) {

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 3ff4936a15a42f74..da19be7da284c250 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -776,10 +776,10 @@ static int machine__process_ksymbol_register(struct machine *machine,
if (dso) {
dso->kernel = DSO_SPACE__KERNEL;
map = map__new2(0, dso);
+ dso__put(dso);
}

if (!dso || !map) {
- dso__put(dso);
return -ENOMEM;
}

@@ -792,6 +792,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
map->start = event->ksymbol.addr;
map->end = map->start + event->ksymbol.len;
maps__insert(&machine->kmaps, map);
+ map__put(map);
dso__set_loaded(dso);

if (is_bpf_image(event->ksymbol.name)) {

2021-06-04 15:19:32

by Riccardo Mancini

[permalink] [raw]
Subject: Re: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

Hi,

On Fri, 2021-06-04 at 10:22 -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 03, 2021 at 09:26:40PM -0700, Ian Rogers escreveu:
> > On Wed, Jun 2, 2021 at 4:15 PM Riccardo Mancini <[email protected]> wrote:
> > > +++ b/tools/perf/util/machine.c
> > > @@ -776,6 +776,7 @@ static int machine__process_ksymbol_register(struct
> > > machine *machine,
> > >                 if (dso) {
> > >                         dso->kernel = DSO_SPACE__KERNEL;
> > >                         map = map__new2(0, dso);
> > > +                       dso__put(dso);
>
> > Will this cause 2 puts if the map allocation fails? Perhaps this
> > should be "if (map) dso__put(dso);".
>
> I think its just a matter of removing the put in the error path, i.e.
> the patch becomes what is at the end of this message.
>
> I.e. if map__new2() fails, we want to drop the dso reference, and if it
> works, we already have a reference to it, obtained in map__new2().

Agree.
I'm sorry for this stupid oversight.
Should we make it a series including the fix to the issue you pointed out below,
or should I send you a v2 and fix the other issue in a subsequent patch?

> But looking at this code now I realize that maps__find() should grab a
> refcount for the map it returns, because in this
> machine__process_ksymbol_register() function we use reference that 'map'
> after the if block, i.e. we use it if it came from maps__find() or if we
> created it machine__process_ksymbol_register, so there is a possible
> race where other thread removes it from the list and map__put()s it
> ending up in map__delete() while we still use it in
> machine__process_ksymbol_register(), right?

Agree. It should be placed before up_read to avoid races, right?
Then we would need to see where it's called and add the appropriate map__put.

In addition, having a look at other possible concurrency issues in map.c:
- maps__for_each_entry should always be called with either read or write lock,
am I right? It looks like this is not done in certain parts of the code. If such
lock is taken, then grabbing the refcount on the looping variable is not needed
unless we need to return it, right?
- maps__first and map__next do not grab a refcount and neither a lock. If
they're used through a lock-protected loop, it's not a problem, but maybe it's
worth making explicit that they are not to be used directly (through either a
comment or adding some underscores in their names).
- maps__empty: should probably take a reader lock.
- maps__find_symbol: the returned symbol is not protected (the caller does not
receive a refcount to neither map or dso, so if dso is deleted, his reference to
the symbol gets invalidated). Depending on how it's being used it might not be a
problem, but in the general scenario I think it's not thread-safe.

Riccardo


>
> - Arnaldo
>
> > >                 }
>
> > >                 if (!dso || !map) {
> > > @@ -792,6 +793,7 @@ static int machine__process_ksymbol_register(struct
> > > machine *machine,
> > >                 map->start = event->ksymbol.addr;
> > >                 map->end = map->start + event->ksymbol.len;
> > >                 maps__insert(&machine->kmaps, map);
> > > +               map__put(map);
> > >                 dso__set_loaded(dso);
>
> > >                 if (is_bpf_image(event->ksymbol.name)) {
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 3ff4936a15a42f74..da19be7da284c250 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -776,10 +776,10 @@ static int machine__process_ksymbol_register(struct
> machine *machine,
>                 if (dso) {
>                         dso->kernel = DSO_SPACE__KERNEL;
>                         map = map__new2(0, dso);
> +                       dso__put(dso);
>                 }
>  
>                 if (!dso || !map) {
> -                       dso__put(dso);
>                         return -ENOMEM;
>                 }
>  
> @@ -792,6 +792,7 @@ static int machine__process_ksymbol_register(struct
> machine *machine,
>                 map->start = event->ksymbol.addr;
>                 map->end = map->start + event->ksymbol.len;
>                 maps__insert(&machine->kmaps, map);
> +               map__put(map);
>                 dso__set_loaded(dso);
>  
>                 if (is_bpf_image(event->ksymbol.name)) {


2021-06-04 18:31:37

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

Em Fri, Jun 04, 2021 at 05:16:39PM +0200, Riccardo Mancini escreveu:
> Hi,
>
> On Fri, 2021-06-04 at 10:22 -0300, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Jun 03, 2021 at 09:26:40PM -0700, Ian Rogers escreveu:
> > > On Wed, Jun 2, 2021 at 4:15 PM Riccardo Mancini <[email protected]> wrote:
> > > > +++ b/tools/perf/util/machine.c
> > > > @@ -776,6 +776,7 @@ static int machine__process_ksymbol_register(struct
> > > > machine *machine,
> > > > ??????????????? if (dso) {
> > > > ??????????????????????? dso->kernel = DSO_SPACE__KERNEL;
> > > > ??????????????????????? map = map__new2(0, dso);
> > > > +?????????????????????? dso__put(dso);
> >
> > > Will this cause 2 puts if the map allocation fails? Perhaps this
> > > should be "if (map) dso__put(dso);".
> >
> > I think its just a matter of removing the put in the error path, i.e.
> > the patch becomes what is at the end of this message.
> >
> > I.e. if map__new2() fails, we want to drop the dso reference, and if it
> > works, we already have a reference to it, obtained in map__new2().
>
> Agree.
> I'm sorry for this stupid oversight.
> Should we make it a series including the fix to the issue you pointed out below,
> or should I send you a v2 and fix the other issue in a subsequent patch?

Please send a v2 patch, and then consider starting a new series with the
issues below.

> > But looking at this code now I realize that maps__find() should grab a
> > refcount for the map it returns, because in this
> > machine__process_ksymbol_register() function we use reference that 'map'
> > after the if block, i.e. we use it if it came from maps__find() or if we
> > created it machine__process_ksymbol_register, so there is a possible
> > race where other thread removes it from the list and map__put()s it
> > ending up in map__delete() while we still use it in
> > machine__process_ksymbol_register(), right?
>
> Agree. It should be placed before up_read to avoid races, right?

Yes, we have to grab a refcount while we are sure its not going away,
then return that as the lookup result, whoever receives that refcounted
entry should use it and then drop the refcount.

> Then we would need to see where it's called and add the appropriate map__put.

yes

> In addition, having a look at other possible concurrency issues in map.c:

Its good to have new eyes looking at this, exactly at a time we're
discussing further parallelizing perf :-)

> - maps__for_each_entry should always be called with either read or write lock,
> am I right? It looks like this is not done in certain parts of the code. If such

Right.

> lock is taken, then grabbing the refcount on the looping variable is not needed
> unless we need to return it, right?

Right, returning an entry needs to take a refcount.

> - maps__first and map__next do not grab a refcount and neither a lock. If
> they're used through a lock-protected loop, it's not a problem, but maybe it's

yes

> worth making explicit that they are not to be used directly (through either a
> comment or adding some underscores in their names).

yes, __ in front means, in kernel style, that it does less than the non
__ prefixed, same name, function.

> - maps__empty: should probably take a reader lock.

Indeed.

> - maps__find_symbol: the returned symbol is not protected (the caller does not
> receive a refcount to neither map or dso, so if dso is deleted, his reference to
> the symbol gets invalidated). Depending on how it's being used it might not be a
> problem, but in the general scenario I think it's not thread-safe.

Yes, that function is also problematic.

Thanks for looking into this, please consider sending patches for these
issues,

- Arnaldo

2021-06-12 17:49:31

by Riccardo Mancini

[permalink] [raw]
Subject: [PATCH v2] perf ksymbol: fix memory leak: decrease refcount of map and dso

ASan reported a memory leak of BPF-related ksymbols
map and dso. The leak is caused by refount never
reaching 0, due to missing __put calls in the function
machine__process_ksymbol_register.
Once the dso is inserted in the map, dso__put should be
called (map__new2 increases the refcount to 2).
The same thing applies for the map when it's inserted
into maps (maps__insert increases the refcount to 2).

$ sudo ./perf record -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

=================================================================
==297735==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 6992 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
#2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
[...]

Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
#2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
#3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
[...]

Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
#2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
[...]

Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
#0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
#1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
#2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
[...]

Signed-off-by: Riccardo Mancini <[email protected]>
---
tools/perf/util/machine.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 3ff4936a15a4..da19be7da284 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -776,10 +776,10 @@ static int machine__process_ksymbol_register(struct machine *machine,
if (dso) {
dso->kernel = DSO_SPACE__KERNEL;
map = map__new2(0, dso);
+ dso__put(dso);
}

if (!dso || !map) {
- dso__put(dso);
return -ENOMEM;
}

@@ -792,6 +792,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
map->start = event->ksymbol.addr;
map->end = map->start + event->ksymbol.len;
maps__insert(&machine->kmaps, map);
+ map__put(map);
dso__set_loaded(dso);

if (is_bpf_image(event->ksymbol.name)) {
--
2.23.0

2021-06-17 03:10:27

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2] perf ksymbol: fix memory leak: decrease refcount of map and dso

Em Sat, Jun 12, 2021 at 07:37:48PM +0200, Riccardo Mancini escreveu:
> ASan reported a memory leak of BPF-related ksymbols
> map and dso. The leak is caused by refount never
> reaching 0, due to missing __put calls in the function
> machine__process_ksymbol_register.
> Once the dso is inserted in the map, dso__put should be
> called (map__new2 increases the refcount to 2).
> The same thing applies for the map when it's inserted
> into maps (maps__insert increases the refcount to 2).

Thanks, applied.

- Arnaldo


> $ sudo ./perf record -- sleep 5
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]
>
> =================================================================
> ==297735==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 6992 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
> #2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
> [...]
>
> Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
> #2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
> #3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
> [...]
>
> Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
> #2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
> [...]
>
> Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
> #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
> #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
> #2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
> [...]
>
> Signed-off-by: Riccardo Mancini <[email protected]>
> ---
> tools/perf/util/machine.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 3ff4936a15a4..da19be7da284 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -776,10 +776,10 @@ static int machine__process_ksymbol_register(struct machine *machine,
> if (dso) {
> dso->kernel = DSO_SPACE__KERNEL;
> map = map__new2(0, dso);
> + dso__put(dso);
> }
>
> if (!dso || !map) {
> - dso__put(dso);
> return -ENOMEM;
> }
>
> @@ -792,6 +792,7 @@ static int machine__process_ksymbol_register(struct machine *machine,
> map->start = event->ksymbol.addr;
> map->end = map->start + event->ksymbol.len;
> maps__insert(&machine->kmaps, map);
> + map__put(map);
> dso__set_loaded(dso);
>
> if (is_bpf_image(event->ksymbol.name)) {
> --
> 2.23.0
>

--

- Arnaldo

2021-06-18 10:06:01

by Riccardo Mancini

[permalink] [raw]
Subject: Re: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

Hi,

On Fri, 2021-06-04 at 15:29 -0300, Arnaldo Carvalho de Melo wrote:
<SNIP> 
> > > But looking at this code now I realize that maps__find() should grab a
> > > refcount for the map it returns, because in this
> > > machine__process_ksymbol_register() function we use reference that 'map'
> > > after the if block, i.e. we use it if it came from maps__find() or if we
> > > created it machine__process_ksymbol_register, so there is a possible
> > > race where other thread removes it from the list and map__put()s it
> > > ending up in map__delete() while we still use it in
> > > machine__process_ksymbol_register(), right?
> >
> > Agree. It should be placed before up_read to avoid races, right?
>
> Yes, we have to grab a refcount while we are sure its not going away,
> then return that as the lookup result, whoever receives that refcounted
> entry should use it and then drop the refcount.
>
> > Then we would need to see where it's called and add the appropriate
> > map__put.
>
> yes

This function has quite a number of callers (direct and indirect) so the the
patch is becoming huge.

One of these callers is thread__find_map, which returns an addr_location
(actually it's an output argument). This addr_location holds references to map,
maps and thread without getting any refcnt (actually in one function it gets it
on the thread and a comment tells to put it once done). If I'm not wrong, this
addr_location is never malloced (always a local variable) and, is should be
present in parts of the code where there should be a refcnt on the thread.
Therefore, maybe it does not get the refcnts since it assumes that thread (upon
which depends maps and as a consequence map) is always refcnted in its context.
However, I think that it should get all refcnts anyways for clarity and to
prevent possible misuses (if I understood correctly, Ian is of the same
opinion).

My solution would be to add the refcnt grabbing for map, maps and thread in
thread__find_map, releasing them in addr_location__put, and then making sure all
callers call it when no longer in use.

Following the same reasoning, I added refcnt grabbing also to mem_info,
branch_info (map was already refcnted, I added it also to maps for coherency),
map_symbol (as in branch_info, I added it to maps), and in other places in which
I saw a pointer was passed without refcounting.

Most changes are quite trivial, however, the changelog is huge:
48 files changed, 472 insertions(+), 157 deletions(-)
Most of them are just returns converted to goto for calling the __put functions.

Doing so, I managed to remove memory leaks caused by refcounting also in perf-
report (I wanted to try also perf top but I encountered another memory-related
issue). However, the changelog is huge and testing all of it is challenging
(especially since I can test missing puts only with ASan's LeakSanitizer and its
reports are usually full of leaks, which I am trying to fix along the way, I
will send some patches in the following days). How would you go about it? Do you
have any suggestions?

>  
> > In addition, having a look at other possible concurrency issues in map.c:
>
> Its good to have new eyes looking at this, exactly at a time we're
> discussing further parallelizing perf :-)
>
> >  - maps__for_each_entry should always be called with either read or write
> > lock,
> > am I right? It looks like this is not done in certain parts of the code. If
> > such
>
> Right.
>
> > lock is taken, then grabbing the refcount on the looping variable is not
> > needed
> > unless we need to return it, right?
>
> Right, returning an entry needs to take a refcount.
>
> >  - maps__first and map__next do not grab a refcount and neither a lock. If
> > they're used through a lock-protected loop, it's not a problem, but maybe
> > it's
>
> yes
>
> > worth making explicit that they are not to be used directly (through either
> > a
> > comment or adding some underscores in their names).
>
> yes, __ in front means, in kernel style, that it does less than the non
> __ prefixed, same name, function.
>
> >  - maps__empty: should probably take a reader lock.
>
> Indeed.
>
> >  - maps__find_symbol: the returned symbol is not protected (the caller does
> > not
> > receive a refcount to neither map or dso, so if dso is deleted, his
> > reference to
> > the symbol gets invalidated). Depending on how it's being used it might not
> > be a
> > problem, but in the general scenario I think it's not thread-safe.
>
> Yes, that function is also problematic.

This issue is easier to solve than expected since the map is returned as **mapp,
so it's just a matter of making sure that the caller always passes it and then
puts the refcnt.

Thanks,
Riccardo

>
> Thanks for looking into this, please consider sending patches for these
> issues,
>
> - Arnaldo


2021-06-18 16:11:16

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf ksymbol: fix memory leak: decrease refcount of map and dso

Em Fri, Jun 18, 2021 at 12:01:28PM +0200, Riccardo Mancini escreveu:
> Hi,
>
> On Fri, 2021-06-04 at 15:29 -0300, Arnaldo Carvalho de Melo wrote:
> <SNIP>?
> > > > But looking at this code now I realize that maps__find() should grab a
> > > > refcount for the map it returns, because in this
> > > > machine__process_ksymbol_register() function we use reference that 'map'
> > > > after the if block, i.e. we use it if it came from maps__find() or if we
> > > > created it machine__process_ksymbol_register, so there is a possible
> > > > race where other thread removes it from the list and map__put()s it
> > > > ending up in map__delete() while we still use it in
> > > > machine__process_ksymbol_register(), right?
> > >
> > > Agree. It should be placed before up_read to avoid races, right?
> >
> > Yes, we have to grab a refcount while we are sure its not going away,
> > then return that as the lookup result, whoever receives that refcounted
> > entry should use it and then drop the refcount.
> >
> > > Then we would need to see where it's called and add the appropriate
> > > map__put.
> >
> > yes
>
> This function has quite a number of callers (direct and indirect) so the the
> patch is becoming huge.
>
> One of these callers is thread__find_map, which returns an addr_location
> (actually it's an output argument). This addr_location holds references to map,
> maps and thread without getting any refcnt (actually in one function it gets it
> on the thread and a comment tells to put it once done). If I'm not wrong, this
> addr_location is never malloced (always a local variable) and, is should be
> present in parts of the code where there should be a refcnt on the thread.
> Therefore, maybe it does not get the refcnts since it assumes that thread (upon
> which depends maps and as a consequence map) is always refcnted in its context.
> However, I think that it should get all refcnts anyways for clarity and to
> prevent possible misuses (if I understood correctly, Ian is of the same
> opinion).

agreed, but this will incur extra costs, we should perhaos use perf to
measure how much it costs. :-)

> My solution would be to add the refcnt grabbing for map, maps and thread in
> thread__find_map, releasing them in addr_location__put, and then making sure all
> callers call it when no longer in use.

Ok

> Following the same reasoning, I added refcnt grabbing also to mem_info,
> branch_info (map was already refcnted, I added it also to maps for coherency),
> map_symbol (as in branch_info, I added it to maps), and in other places in which
> I saw a pointer was passed without refcounting.
>
> Most changes are quite trivial, however, the changelog is huge:
> 48 files changed, 472 insertions(+), 157 deletions(-)
> Most of them are just returns converted to goto for calling the __put functions.

So you could first do a prep patch converting functions to have gotos,
which would be a no-logic change, and then do the rest?

> Doing so, I managed to remove memory leaks caused by refcounting also in perf-
> report (I wanted to try also perf top but I encountered another memory-related
> issue). However, the changelog is huge and testing all of it is challenging

So we should break it in as many small steps as possible, knowing that
each step is fixing just one of the problems, i.e. aSAN will continue
reporting problems, but less problems as you go on adding more fixes.

> (especially since I can test missing puts only with ASan's LeakSanitizer and its
> reports are usually full of leaks, which I am trying to fix along the way, I
> will send some patches in the following days). How would you go about it? Do you
> have any suggestions?

See above, thanks for working on this!

- Arnaldo