2022-03-10 14:21:11

by Yaliang Wang

[permalink] [raw]
Subject: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

From: Yaliang Wang <[email protected]>

pgd page is freed by generic implementation pgd_free() since commit
f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
however, there are scenarios that the system uses more than one page as
the pgd table, in such cases the generic implementation pgd_free() won't
be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
will be set as "1", which will cause allocating two pages as the pgd
table. Well, at the same time, the generic implementation pgd_free()
just free one pgd page, which will result in the memory leak.

The memory leak can be easily detected by executing shell command:
"while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"

Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
Signed-off-by: Yaliang Wang <[email protected]>
---
arch/mips/include/asm/pgalloc.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
index c7925d0e9874..867e9c3db76e 100644
--- a/arch/mips/include/asm/pgalloc.h
+++ b/arch/mips/include/asm/pgalloc.h
@@ -15,6 +15,7 @@

#define __HAVE_ARCH_PMD_ALLOC_ONE
#define __HAVE_ARCH_PUD_ALLOC_ONE
+#define __HAVE_ARCH_PGD_FREE
#include <asm-generic/pgalloc.h>

static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
@@ -48,6 +49,11 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
extern void pgd_init(unsigned long page);
extern pgd_t *pgd_alloc(struct mm_struct *mm);

+static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
+{
+ free_pages((unsigned long)pgd, PGD_ORDER);
+}
+
#define __pte_free_tlb(tlb,pte,address) \
do { \
pgtable_pte_page_dtor(pte); \
--
2.25.1


2022-03-14 16:33:02

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On Thu, Mar 10, 2022 at 07:31:16PM +0800, [email protected] wrote:
> From: Yaliang Wang <[email protected]>
>
> pgd page is freed by generic implementation pgd_free() since commit
> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
> however, there are scenarios that the system uses more than one page as
> the pgd table, in such cases the generic implementation pgd_free() won't
> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
> will be set as "1", which will cause allocating two pages as the pgd
> table. Well, at the same time, the generic implementation pgd_free()
> just free one pgd page, which will result in the memory leak.
>
> The memory leak can be easily detected by executing shell command:
> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
>
> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
> Signed-off-by: Yaliang Wang <[email protected]>
> ---
> arch/mips/include/asm/pgalloc.h | 6 ++++++
> 1 file changed, 6 insertions(+)

applied to mips-next.

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2022-04-02 17:00:32

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On Thu, 10 Mar 2022, [email protected] wrote:

> pgd page is freed by generic implementation pgd_free() since commit
> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
> however, there are scenarios that the system uses more than one page as
> the pgd table, in such cases the generic implementation pgd_free() won't
> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
> will be set as "1", which will cause allocating two pages as the pgd
> table. Well, at the same time, the generic implementation pgd_free()
> just free one pgd page, which will result in the memory leak.
>
> The memory leak can be easily detected by executing shell command:
> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
>
> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
> Signed-off-by: Yaliang Wang <[email protected]>

As a critical regression shouldn't this have been marked for backporting
to stable branches?

Maciej

2022-04-04 21:17:49

by Andrew Powers-Holmes

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On 3/4/2022 12:48 am, Maciej W. Rozycki wrote:
> On Thu, 10 Mar 2022, [email protected] wrote:
>
>> pgd page is freed by generic implementation pgd_free() since commit
>> f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()"),
>> however, there are scenarios that the system uses more than one page as
>> the pgd table, in such cases the generic implementation pgd_free() won't
>> be applicable anymore. For example, when PAGE_SIZE_4KB is enabled and
>> MIPS_VA_BITS_48 is not enabled in a 64bit system, the macro "PGD_ORDER"
>> will be set as "1", which will cause allocating two pages as the pgd
>> table. Well, at the same time, the generic implementation pgd_free()
>> just free one pgd page, which will result in the memory leak.
>>
>> The memory leak can be easily detected by executing shell command:
>> "while true; do ls > /dev/null; grep MemFree /proc/meminfo; done"
>>
>> Fixes: f9cb654cb550 ("asm-generic: pgalloc: provide generic pgd_free()")
>> Signed-off-by: Yaliang Wang <[email protected]>
>
> As a critical regression shouldn't this have been marked for backporting
> to stable branches?

Very yes please - this bug has been driving several of us at OpenWrt
crazy for quite[1] some[2] time now, mostly on Octeon devices. We'd
(wrongly) suspected the octeon-ethernet driver, but this morning finally
bisected it down to f9cb654cb550 and can confirm this patch fixes the
regression.

MIPS64 has essentially been broken/unusable for 8 kernel releases,
including two LTS kernels, since the original commit landed. Should
there not have been CI/tests that caught this? It's pretty major!

- Andrew

[1]
https://forum.openwrt.org/t/oom-killer-dnsmasq-when-physical-free-ram-remains/109351
[2]
https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827

2022-04-04 22:52:13

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On Sun, 3 Apr 2022, Andrew Holmes wrote:

> MIPS64 has essentially been broken/unusable for 8 kernel releases,
> including two LTS kernels, since the original commit landed. Should
> there not have been CI/tests that caught this? It's pretty major!

AFAIK the MIPS port is only maintained on the best effort basis nowadays
I'm afraid. I.e. it's enthusiasts investing their free time for the joy
of fiddling with things. So things are bound to break from time to time
and remain unnoticed for a while. We're doing our best, but our resources
are limited.

Taking these limitations into account I think Thomas has been doing a
tremendous job maintaining the MIPS port, but he hasn't been cc-ed on the
submission of the original change and it's very easy to miss stuff in the
flood that has only been posted to a mailing list.

Maciej

2022-04-05 00:40:28

by Andrew Powers-Holmes

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On 3/04/2022 8:37 pm, Maciej W. Rozycki wrote:
> AFAIK the MIPS port is only maintained on the best effort basis
> nowadays I'm afraid. I.e. it's enthusiasts investing their free time
> for the joy of fiddling with things. So things are bound to break
> from time to time and remain unnoticed for a while. We're doing our
> best, but our resources are limited.
>
> Taking these limitations into account I think Thomas has been doing a
> tremendous job maintaining the MIPS port, but he hasn't been cc-ed on
> the submission of the original change and it's very easy to miss
> stuff in the flood that has only been posted to a mailing list.
>
> Maciej

Fair enough :) apologies, didn't mean to sound combative or ungrateful.
I know there's far more work to go around than people to do it,
everyone's doing the best they can, and I have nothing but appreciation
for all the work the kernel community does.

It's just surprising that this *could* go unnoticed for over a year -
though I suppose most of the MIPS64 systems out there are running on one
or another old vendor SDK kernel so won't have been affected...

Would the best way to get this merged into 5.10/15 (and maybe .16 just
for good measure) be to email the stable team (since it's already in
Linus' tree)? Documentation/process/stable-kernel-rules seems to say
yes, but I'd like to avoid stepping on anyone's toes given that it's not
my patch.

- Andrew

2022-04-05 02:46:56

by Joshua Kinard

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On 4/3/2022 06:37, Maciej W. Rozycki wrote:
> On Sun, 3 Apr 2022, Andrew Holmes wrote:
>
>> MIPS64 has essentially been broken/unusable for 8 kernel releases,
>> including two LTS kernels, since the original commit landed. Should
>> there not have been CI/tests that caught this? It's pretty major!
>
> AFAIK the MIPS port is only maintained on the best effort basis nowadays
> I'm afraid. I.e. it's enthusiasts investing their free time for the joy
> of fiddling with things. So things are bound to break from time to time
> and remain unnoticed for a while. We're doing our best, but our resources
> are limited.
>
> Taking these limitations into account I think Thomas has been doing a
> tremendous job maintaining the MIPS port, but he hasn't been cc-ed on the
> submission of the original change and it's very easy to miss stuff in the
> flood that has only been posted to a mailing list.
>
> Maciej
>

FWIW, hot off the presses is RFC9225:
https://datatracker.ietf.org/doc/html/rfc9225

4. Best Current Practises

1. Authors MUST NOT implement bugs.

2. If bugs are introduced in code, they MUST be clearly documented.

3. When implementing specifications that are broken by design, it is
RECOMMENDED to aggregate multiple smaller bugs into one larger
bug. This will be easier to document: rather than having a lot
of hard-to-track inconsequential bugs, there will be only a few
easy-to-recognise significant bugs.

4. The aphorism "It's not a bug, it's a feature" is considered rude.

5. Assume all external input is the result of (a series of) bugs.
(Especially in machine-to-machine applications such as
implementations of network protocols.)

6. In fact, assume all internal inputs also are the result of bugs.

--
Joshua Kinard
Gentoo/MIPS
[email protected]
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic

2022-04-05 03:25:15

by Donald Hoskins

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

Hi there,

This fix should backported.  Effectively, all mips64-based Cavium Octeon
processors have been broken since the original commit; this has been a
persistent and surreptitious issue for the OpenWrt community since
nearly a year ago (referencing
https://forum.openwrt.org/t/upstream-kernel-memleak-5-10-octeon-ethernet-ko/111827/),
and nearly ended with OpenWrt marking the target as entirely unsupported
and moving on.

We understand that there are ultimately few users of mips64, but it's
really important that our memory management work. While OpenWrt can
backport until upstream inclusion, I am sure there are other users and
software platforms unaware of this fix (and unable to take advantage of
it), thus making those platforms on this arch completely unusable.

2022-04-06 05:48:44

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On Tue, Apr 05, 2022 at 10:42:21AM +0100, Maciej W. Rozycki wrote:
> On Mon, 4 Apr 2022, Andrew Powers-Holmes wrote:
>
> > Would the best way to get this merged into 5.10/15 (and maybe .16 just
> > for good measure) be to email the stable team (since it's already in
> > Linus' tree)? Documentation/process/stable-kernel-rules seems to say
> > yes, but I'd like to avoid stepping on anyone's toes given that it's not
> > my patch.
>
> You seem the most severely affected so far, so why not act in your best
> interest? I think option #2 applies here and seems quite straightforward
> to follow, referring commit 2bc5bab9a763 and using your use case as the
> justification. It doesn't have to be the author to request a backport.
>
> NB I think it has to be backported to all the stable branches made since
> the original breakage; i.e. v5.9+ (I haven't kept track of what they are).

the fix has a Fixes tag so it will usually ported to stable/longterm kernels.
I already saw it in Greg's patch bombs.

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]

2022-04-06 07:45:55

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On Mon, 4 Apr 2022, Andrew Powers-Holmes wrote:

> Fair enough :) apologies, didn't mean to sound combative or ungrateful.
> I know there's far more work to go around than people to do it,
> everyone's doing the best they can, and I have nothing but appreciation
> for all the work the kernel community does.

No offence taken; I just wanted to make it absolutely clear what the
situation currently is.

> It's just surprising that this *could* go unnoticed for over a year -
> though I suppose most of the MIPS64 systems out there are running on one
> or another old vendor SDK kernel so won't have been affected...

That's subject to the probability theory and depending on what people's
usage models are.

> Would the best way to get this merged into 5.10/15 (and maybe .16 just
> for good measure) be to email the stable team (since it's already in
> Linus' tree)? Documentation/process/stable-kernel-rules seems to say
> yes, but I'd like to avoid stepping on anyone's toes given that it's not
> my patch.

You seem the most severely affected so far, so why not act in your best
interest? I think option #2 applies here and seems quite straightforward
to follow, referring commit 2bc5bab9a763 and using your use case as the
justification. It doesn't have to be the author to request a backport.

NB I think it has to be backported to all the stable branches made since
the original breakage; i.e. v5.9+ (I haven't kept track of what they are).

Maciej

2022-04-06 10:11:11

by Maciej W. Rozycki

[permalink] [raw]
Subject: Re: [PATCH] MIPS: pgalloc: fix memory leak caused by pgd_free()

On Tue, 5 Apr 2022, Thomas Bogendoerfer wrote:

> > NB I think it has to be backported to all the stable branches made since
> > the original breakage; i.e. v5.9+ (I haven't kept track of what they are).
>
> the fix has a Fixes tag so it will usually ported to stable/longterm kernels.
> I already saw it in Greg's patch bombs.

Hmm, not all fixes qualify for or indeed are worth backporting and I'd
expect those that have no stable annotation to remain on trunk only. I
have been following this principle with my submissions anyway.

Indeed I can see a backport to 5.17 has literally just been posted in
this humongous patch set, but in this case I suspect Greg has just picked
this up by hand (thanks, Greg!) having seen this discussion (though how he
manages to escape alive through the flood of messages has been astonishing
me).

Maciej