2022-09-21 11:50:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
debugfs_lookup()"), if the module is unloaded, the directory will still
be present if the module is loaded again and creating the directory will
fail, causing the creation of additional child debugfs entries for the
individual devices to fail.

As this module never cleaned up the root directory it created, even when
loaded, and unloading/loading a module is not a normal operation, none
of would normally happen.

To clean all of this up, use a tiny reference counted structure to hold
the root debugfs directory for the driver, and then clean it up when the
last user of it is removed from the system. This should resolve the
previously reported problems, and the original memory leak that
fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
debugfs_lookup()") was attempting to fix.

Reported-by: Russell King <[email protected]>
Cc: Marcin Wojtas <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Cc: stable <[email protected]>
Fixes: fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()")
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Note, test-built only, I do not have access to this hardware so please
review it for any foolish mistakes I might have again made.

.../ethernet/marvell/mvpp2/mvpp2_debugfs.c | 32 ++++++++++++++++---
1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c
index 0eec05d905eb..16c303048f25 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c
@@ -691,23 +691,47 @@ static int mvpp2_dbgfs_port_init(struct dentry *parent,
return 0;
}

+static struct mvpp2_debug_dir {
+ struct dentry *dir;
+ struct kref kref;
+} *mvpp2_root;
+
+static void mvpp2_release(struct kref *kref)
+{
+ struct mvpp2_debug_dir *mvpp2_root = container_of(kref, struct mvpp2_debug_dir, kref);
+
+ debugfs_remove(mvpp2_root->dir);
+ kfree(mvpp2_root);
+}
+
void mvpp2_dbgfs_cleanup(struct mvpp2 *priv)
{
debugfs_remove_recursive(priv->dbgfs_dir);

kfree(priv->dbgfs_entries);
+
+ if (mvpp2_root)
+ kref_put(&mvpp2_root->kref, mvpp2_release);
}

void mvpp2_dbgfs_init(struct mvpp2 *priv, const char *name)
{
- static struct dentry *mvpp2_root;
struct dentry *mvpp2_dir;
int ret, i;

- if (!mvpp2_root)
- mvpp2_root = debugfs_create_dir(MVPP2_DRIVER_NAME, NULL);
+ if (!mvpp2_root) {
+ mvpp2_root = kzalloc(sizeof(mvpp2_root), GFP_KERNEL);
+ if (!mvpp2_root) {
+ mvpp2_root = NULL;
+ return;
+ }
+ mvpp2_root->dir = debugfs_create_dir(MVPP2_DRIVER_NAME, NULL);
+ kref_init(&mvpp2_root->kref);
+ } else {
+ kref_get(&mvpp2_root->kref);
+ }

- mvpp2_dir = debugfs_create_dir(name, mvpp2_root);
+ mvpp2_dir = debugfs_create_dir(name, mvpp2_root->dir);

priv->dbgfs_dir = mvpp2_dir;
priv->dbgfs_entries = kzalloc(sizeof(*priv->dbgfs_entries), GFP_KERNEL);
--
2.37.3


2022-09-22 16:32:02

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> debugfs_lookup()"), if the module is unloaded, the directory will still
> be present if the module is loaded again and creating the directory will
> fail, causing the creation of additional child debugfs entries for the
> individual devices to fail.
>
> As this module never cleaned up the root directory it created, even when
> loaded, and unloading/loading a module is not a normal operation, none
> of would normally happen.
>
> To clean all of this up, use a tiny reference counted structure to hold
> the root debugfs directory for the driver, and then clean it up when the
> last user of it is removed from the system. This should resolve the
> previously reported problems, and the original memory leak that
> fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> debugfs_lookup()") was attempting to fix.

For the record... I have a better fix for this, but I haven't been able
to get it into a state suitable for submission yet.

http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff

Not yet against the net tree. Might have time tomorrow to do that, not
sure at the moment. Medical stuff is getting in the way. :(

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

2022-09-22 17:30:02

by Marcin Wojtas

[permalink] [raw]
Subject: Re: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

Hi,

Thank you both for the patches.


czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
<[email protected]> napisał(a):
>
> On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > debugfs_lookup()"), if the module is unloaded, the directory will still
> > be present if the module is loaded again and creating the directory will
> > fail, causing the creation of additional child debugfs entries for the
> > individual devices to fail.
> >
> > As this module never cleaned up the root directory it created, even when
> > loaded, and unloading/loading a module is not a normal operation, none
> > of would normally happen.
> >
> > To clean all of this up, use a tiny reference counted structure to hold
> > the root debugfs directory for the driver, and then clean it up when the
> > last user of it is removed from the system. This should resolve the
> > previously reported problems, and the original memory leak that
> > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > debugfs_lookup()") was attempting to fix.
>
> For the record... I have a better fix for this, but I haven't been able
> to get it into a state suitable for submission yet.
>
> http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
>
> Not yet against the net tree. Might have time tomorrow to do that, not
> sure at the moment. Medical stuff is getting in the way. :(
>

I'd lean towards this version - it is a bit more compact. I'll try to
test that tomorrow or during the weekend.

Best regards,
Marcin

2022-09-25 00:27:05

by Marcin Wojtas

[permalink] [raw]
Subject: Re: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

Hi Russell,

czw., 22 wrz 2022 o 19:08 Marcin Wojtas <[email protected]> napisał(a):
>
> Hi,
>
> Thank you both for the patches.
>
>
> czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
> <[email protected]> napisał(a):
> >
> > On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > debugfs_lookup()"), if the module is unloaded, the directory will still
> > > be present if the module is loaded again and creating the directory will
> > > fail, causing the creation of additional child debugfs entries for the
> > > individual devices to fail.
> > >
> > > As this module never cleaned up the root directory it created, even when
> > > loaded, and unloading/loading a module is not a normal operation, none
> > > of would normally happen.
> > >
> > > To clean all of this up, use a tiny reference counted structure to hold
> > > the root debugfs directory for the driver, and then clean it up when the
> > > last user of it is removed from the system. This should resolve the
> > > previously reported problems, and the original memory leak that
> > > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > debugfs_lookup()") was attempting to fix.
> >
> > For the record... I have a better fix for this, but I haven't been able
> > to get it into a state suitable for submission yet.
> >
> > http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
> >
> > Not yet against the net tree. Might have time tomorrow to do that, not
> > sure at the moment. Medical stuff is getting in the way. :(
> >
>
> I'd lean towards this version - it is a bit more compact. I'll try to
> test that tomorrow or during the weekend.
>

I improved the patch compile and work (tested on my CN913x board).
Feel free to submit (if you wish, I can do it too - please let me know
your preference):
https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b

Best regards,
Marcin

2022-09-25 07:08:28

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

On Sun, Sep 25, 2022 at 01:27:06AM +0200, Marcin Wojtas wrote:
> Hi Russell,
>
> czw., 22 wrz 2022 o 19:08 Marcin Wojtas <[email protected]> napisał(a):
> >
> > Hi,
> >
> > Thank you both for the patches.
> >
> >
> > czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
> > <[email protected]> napisał(a):
> > >
> > > On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > > > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > debugfs_lookup()"), if the module is unloaded, the directory will still
> > > > be present if the module is loaded again and creating the directory will
> > > > fail, causing the creation of additional child debugfs entries for the
> > > > individual devices to fail.
> > > >
> > > > As this module never cleaned up the root directory it created, even when
> > > > loaded, and unloading/loading a module is not a normal operation, none
> > > > of would normally happen.
> > > >
> > > > To clean all of this up, use a tiny reference counted structure to hold
> > > > the root debugfs directory for the driver, and then clean it up when the
> > > > last user of it is removed from the system. This should resolve the
> > > > previously reported problems, and the original memory leak that
> > > > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > debugfs_lookup()") was attempting to fix.
> > >
> > > For the record... I have a better fix for this, but I haven't been able
> > > to get it into a state suitable for submission yet.
> > >
> > > http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
> > >
> > > Not yet against the net tree. Might have time tomorrow to do that, not
> > > sure at the moment. Medical stuff is getting in the way. :(
> > >
> >
> > I'd lean towards this version - it is a bit more compact. I'll try to
> > test that tomorrow or during the weekend.
> >
>
> I improved the patch compile and work (tested on my CN913x board).
> Feel free to submit (if you wish, I can do it too - please let me know
> your preference):
> https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b

I don't see what the compile fixes were in that - it looks like my patch
ported onto current -net. Obvious changes:

- moved mvpp2_dbgfs_exit() declaration from after mvpp2_dbgfs_cleanup()
to before.
- moved definition of mvpp2_root to the top of the file (as no effect
on the code.)

and the change to port it:

- dropped my mvpp2_dbgfs_init() hunk (because it's different in -net)
- removed static declaration of mvpp2_root in mvpp2_dbgfs_init()

I'm not seeing any other changes.

Note that Sasha has submitted a revert of Greg's original patch for
mainline, so my original patch should apply as-is if that revert
happens - and I don't see any compile issues with it.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

2022-09-25 07:19:51

by Russell King (Oracle)

[permalink] [raw]
Subject: Re: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

On Sun, Sep 25, 2022 at 07:58:29AM +0100, Russell King (Oracle) wrote:
> On Sun, Sep 25, 2022 at 01:27:06AM +0200, Marcin Wojtas wrote:
> > Hi Russell,
> >
> > I improved the patch compile and work (tested on my CN913x board).
> > Feel free to submit (if you wish, I can do it too - please let me know
> > your preference):
> > https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b
>
> I don't see what the compile fixes were in that - it looks like my patch
> ported onto current -net. Obvious changes:
>
> - moved mvpp2_dbgfs_exit() declaration from after mvpp2_dbgfs_cleanup()
> to before.
> - moved definition of mvpp2_root to the top of the file (as no effect
> on the code.)
>
> and the change to port it:
>
> - dropped my mvpp2_dbgfs_init() hunk (because it's different in -net)
> - removed static declaration of mvpp2_root in mvpp2_dbgfs_init()
>
> I'm not seeing any other changes.
>
> Note that Sasha has submitted a revert of Greg's original patch for
> mainline, so my original patch should apply as-is if that revert
> happens - and I don't see any compile issues with it.

On that - it seems the Stable kernel maintainers can't cope with being
told not to apply a patch - we've had a big long discussion about it
on IRC over the past few days.

Sasha states that the mainline process is broken - and as long as Greg's
original patch is in place, stable will repeatedly attempt to backport
it no matter whether a proper fix is being worked on, whether
maintainers are busy, and so on and so forth. The only way to stop
stable backporting patches is to revert them in mainline - as I asked to
happen in this case on the 13th. So it's all our fault for not
reverting the patch. It's got nothing to do with stable maintainers
unable to keep track of which patches they shouldn't be picking up.

I maintain that the stable kernel process is totally broken due to
this - it makes no allowance for whether maintainers can sort the
problem out in a timely manner.

Quite honestly, I now regard the stable kernel process as being
utterly broken. In my mind, it's not a stable kernel. It's an unstable
kernel with randomly applied patches that may or may not be appropriate.
Certainly, stable-kernel-rules.rst is a total and utter joke - they
aren't rules at all, the stable kernel maintainers don't have any rules
about patches they backport.

Anyway, I guess we're going to have to wait for Sasha's revert to be
merged into -net first, which will make backporting our true and proper
memory-leak fix a lot easier.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

2022-09-25 08:14:45

by Marcin Wojtas

[permalink] [raw]
Subject: Re: [PATCH net] net: mvpp2: debugfs: fix problem with previous memory leak fix

niedz., 25 wrz 2022 o 08:58 Russell King (Oracle)
<[email protected]> napisał(a):
>
> On Sun, Sep 25, 2022 at 01:27:06AM +0200, Marcin Wojtas wrote:
> > Hi Russell,
> >
> > czw., 22 wrz 2022 o 19:08 Marcin Wojtas <[email protected]> napisał(a):
> > >
> > > Hi,
> > >
> > > Thank you both for the patches.
> > >
> > >
> > > czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
> > > <[email protected]> napisał(a):
> > > >
> > > > On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > > > > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > > debugfs_lookup()"), if the module is unloaded, the directory will still
> > > > > be present if the module is loaded again and creating the directory will
> > > > > fail, causing the creation of additional child debugfs entries for the
> > > > > individual devices to fail.
> > > > >
> > > > > As this module never cleaned up the root directory it created, even when
> > > > > loaded, and unloading/loading a module is not a normal operation, none
> > > > > of would normally happen.
> > > > >
> > > > > To clean all of this up, use a tiny reference counted structure to hold
> > > > > the root debugfs directory for the driver, and then clean it up when the
> > > > > last user of it is removed from the system. This should resolve the
> > > > > previously reported problems, and the original memory leak that
> > > > > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > > debugfs_lookup()") was attempting to fix.
> > > >
> > > > For the record... I have a better fix for this, but I haven't been able
> > > > to get it into a state suitable for submission yet.
> > > >
> > > > http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
> > > >
> > > > Not yet against the net tree. Might have time tomorrow to do that, not
> > > > sure at the moment. Medical stuff is getting in the way. :(
> > > >
> > >
> > > I'd lean towards this version - it is a bit more compact. I'll try to
> > > test that tomorrow or during the weekend.
> > >
> >
> > I improved the patch compile and work (tested on my CN913x board).
> > Feel free to submit (if you wish, I can do it too - please let me know
> > your preference):
> > https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b
>
> I don't see what the compile fixes were in that - it looks like my patch

I overlooked drivers/net/ethernet/marvell/mvpp2/mvpp2.h diff (I
applied patch manually), but I double-checked and it was there, so you
are right.

> ported onto current -net. Obvious changes:

Yes, it's applied on top of net/net-next.

>
> - moved mvpp2_dbgfs_exit() declaration from after mvpp2_dbgfs_cleanup()
> to before.
> - moved definition of mvpp2_root to the top of the file (as no effect
> on the code.)

These are irrelevant. The most important part was:

--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -7714,8 +7714,8 @@ module_init(mvpp2_driver_init);

static void __exit mvpp2_driver_exit(void)
{
- mvpp2_dbgfs_exit();
platform_driver_unregister(&mvpp2_driver);
+ mvpp2_dbgfs_exit();
}
module_exit(mvpp2_driver_exit);

Otherwise, I observed NULL-pointer dereference when
mvpp2_remove()->mvpp2_dbgfs_cleanup().

>
> and the change to port it:
>
> - dropped my mvpp2_dbgfs_init() hunk (because it's different in -net)
> - removed static declaration of mvpp2_root in mvpp2_dbgfs_init()
>
> I'm not seeing any other changes.
>
> Note that Sasha has submitted a revert of Greg's original patch for
> mainline, so my original patch should apply as-is if that revert
> happens - and I don't see any compile issues with it.
>

Sure. In order to avoid misunderstandings, I stop here and leave the
submission to you.

Thanks,
Marcin