2013-05-16 08:42:27

by Oskar Andero

Subject: [PATCH] mm: vmscan: handle any negative return value from scan_objects

Shrinkers must return -1 to indicate that they are busy. Instead, treat
any negative value as busy.
This fixes a potential bug if scan_objects returns a negative value other
than -1.

Cc: Glauber Costa <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Oskar Andero <[email protected]>
---
include/linux/shrinker.h | 7 ++++---
mm/vmscan.c | 2 +-
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 3b08869..ced0e91 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -38,9 +38,10 @@ struct shrink_control {
  * @scan_objects will only be called if @count_objects returned a positive
  * value for the number of freeable objects. The callout should scan the cache
  * and attempt to free items from the cache. It should then return the number of
- * objects freed during the scan, or -1 if progress cannot be made due to
- * potential deadlocks. If -1 is returned, then no further attempts to call the
- * @scan_objects will be made from the current reclaim context.
+ * objects freed during the scan, or a negative value if progress cannot be made
+ * due to potential deadlocks. If a negative value is returned, then no further
+ * attempts to call the @scan_objects will be made from the current reclaim
+ * context.
  */
 struct shrinker {
 	long (*count_objects)(struct shrinker *, struct shrink_control *sc);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6bac41e..acb4aef 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -291,7 +291,7 @@ shrink_slab_one(struct shrinker *shrinker, struct shrink_control *shrinkctl,

 		shrinkctl->nr_to_scan = nr_to_scan;
 		ret = shrinker->scan_objects(shrinker, shrinkctl);
-		if (ret == -1)
+		if (ret < 0)
 			break;
 		freed += ret;

--
1.8.1.5
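
For readers following along, here is a minimal userspace sketch of the loop behaviour the patch proposes. It is not the kernel code: `scan_fn`, `scan_batches`, `scan_all` and `scan_busy` are hypothetical names invented for this illustration, and the real loop in mm/vmscan.c does more bookkeeping.

```c
/* Hypothetical stand-in for a shrinker's scan_objects callback. */
typedef long (*scan_fn)(long nr_to_scan);

/*
 * Simplified model of the batching loop: call scan() in batches and
 * stop as soon as it returns any negative value (the patched check).
 */
static long scan_batches(scan_fn scan, long total_scan, long batch_size)
{
	long freed = 0;

	while (total_scan >= batch_size) {
		long ret = scan(batch_size);

		if (ret < 0)	/* the check before the patch was: ret == -1 */
			break;
		freed += ret;
		total_scan -= batch_size;
	}
	return freed;
}

/* Example callbacks: one frees everything it is asked to, one returns -2. */
static long scan_all(long nr_to_scan) { return nr_to_scan; }
static long scan_busy(long nr_to_scan) { (void)nr_to_scan; return -2; }
```

With the `ret < 0` check, a callback returning -2 ends the loop with nothing added to `freed`; with the `ret == -1` check it would instead fall through to `freed += ret`.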


2013-05-16 11:52:31

by Dave Chinner

Subject: Re: [PATCH] mm: vmscan: handle any negative return value from scan_objects

On Thu, May 16, 2013 at 10:42:16AM +0200, Oskar Andero wrote:
> Shrinkers must return -1 to indicate that they are busy. Instead, treat
> any negative value as busy.

Why? The API defines a return condition for aborting a scan and gives
a specific value for doing that. i.e. explain why we should change the
API to over-specify the "abort scan" return value like this.

FWIW, using "any" negative number for "abort scan" is a bad API
design decision. It means that in future we can't introduce
different negative return values in the API if we have a need to.
i.e. each specific negative return value needs to have the potential
for defining a different behaviour.

So if any change needs to be made, it is to change the -1 return
value to an enum and have the shrinkers return that enum when they
want an abort.

-Dave.
--
Dave Chinner
[email protected]
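
A minimal sketch of the enum idea suggested above, written as standalone C. The names `shrink_status`, `SHRINK_ABORT`, `scan_cache` and `scan_aborted` are hypothetical and chosen only for this illustration; they are not taken from the kernel tree.

```c
/* Hypothetical names; not taken from the kernel tree. */
enum shrink_status {
	SHRINK_ABORT = -1,	/* the one defined negative return value */
};

/* Toy scan callback: aborts if it would deadlock, else "frees" everything. */
static long scan_cache(int would_deadlock, long nr_to_scan)
{
	if (would_deadlock)
		return SHRINK_ABORT;
	return nr_to_scan;
}

/* The caller compares against the named constant, not a bare -1. */
static int scan_aborted(long ret)
{
	return ret == SHRINK_ABORT;
}
```

Naming the value keeps the rest of the negative range free for other meanings later, which is the design concern raised in this reply.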

2013-05-16 12:27:56

by Oskar Andero

Subject: Re: [PATCH] mm: vmscan: handle any negative return value from scan_objects

On 13:52 Thu 16 May , Dave Chinner wrote:
> On Thu, May 16, 2013 at 10:42:16AM +0200, Oskar Andero wrote:
> > Shrinkers must return -1 to indicate that they are busy. Instead, treat
> > any negative value as busy.
>
> Why? The API defines a return condition for aborting a scan and gives
> a specific value for doing that. i.e. explain why we should change the
> API to over-specify the "abort scan" return value like this.

As I pointed out earlier, looking into the code (from master):
	if (shrink_ret == -1)
		break;
	if (shrink_ret < nr_before)
		ret += nr_before - shrink_ret;

This piece of code lacks a sanity check and will only function if shrink_ret
is either greater than zero or exactly -1. If shrink_ret is e.g. -2 this will
lead to undefined behaviour.

> FWIW, using "any" negative number for "abort scan" is a bad API
> design decision. It means that in future we can't introduce
> different negative return values in the API if we have a need to.
> i.e. each specific negative return value needs to have the potential
> for defining a different behaviour.

An alternative to my patch would be to add:
	if (shrink_ret < -1)
		/* handle illegal return code in some way */

> So if any change needs to be made, it is to change the -1 return
> value to an enum and have the shrinkers return that enum when they
> want an abort.

I am all for an enum, but I still believe we should handle the case where
the shrinkers return something wicked.

-Oskar
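
To make the concern concrete, the accounting in the snippet quoted above can be modelled in userspace. This is only a sketch under stated assumptions: `accounted` is a hypothetical helper, while `nr_before` and `shrink_ret` are the names used in the snippet, and the surrounding kernel loop is omitted.

```c
/*
 * Model of the quoted accounting. shrink_ret is what the shrinker
 * returned; nr_before is the object count before the call. Returns the
 * amount that would be added to the running total.
 */
static long accounted(long nr_before, long shrink_ret)
{
	if (shrink_ret == -1)
		return 0;		/* abort: nothing is added */
	if (shrink_ret < nr_before)
		return nr_before - shrink_ret;
	return 0;
}
```

In this model, `accounted(100, 60)` yields 40 and `accounted(100, -1)` yields 0, but `accounted(100, -2)` yields 102, which is more than the 100 objects that existed before the call.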

2013-05-17 06:33:54

by Dave Chinner

Subject: Re: [PATCH] mm: vmscan: handle any negative return value from scan_objects

On Thu, May 16, 2013 at 02:27:52PM +0200, Oskar Andero wrote:
> On 13:52 Thu 16 May , Dave Chinner wrote:
> > On Thu, May 16, 2013 at 10:42:16AM +0200, Oskar Andero wrote:
> > > Shrinkers must return -1 to indicate that they are busy. Instead, treat
> > > any negative value as busy.
> >
> > Why? The API defines a return condition for aborting a scan and gives
> > a specific value for doing that. i.e. explain why we should change the
> > API to over-specify the "abort scan" return value like this.
>
> As I pointed out earlier, looking into the code (from master):
> 	if (shrink_ret == -1)
> 		break;
> 	if (shrink_ret < nr_before)
> 		ret += nr_before - shrink_ret;
>
> This piece of code lacks a sanity check and will only function if shrink_ret
> is either greater than zero or exactly -1. If shrink_ret is e.g. -2 this will
> lead to undefined behaviour.

If a shrinker is returning -2 then the shrinker is broken and needs
fixing.

> > FWIW, using "any" negative number for "abort scan" is a bad API
> > design decision. It means that in future we can't introduce
> > different negative return values in the API if we have a need to.
> > i.e. each specific negative return value needs to have the potential
> > for defining a different behaviour.
>
> An alternative to my patch would be to add:
> 	if (shrink_ret < -1)
> 		/* handle illegal return code in some way */

How? We have one valid negative return code. WTF are we supposed to
do if a shrinker is passing undefined return values? IOWs, the only
sane thing to do is:

BUG_ON(shrink_ret < -1);

> > So if any change needs to be made, it is to change the -1 return
> > value to an enum and have the shrinkers return that enum when they
> > want an abort.
>
> I am all for an enum, but I still believe we should handle the case where
> the shrinkers return something wicked.

Which bit of "broken shrinkers need to be fixed" don't you
understand? A BUG_ON() will make sure they get fixed - anything else
that allows broken shrinkers to continue functioning is a completely
unacceptable solution.

-Dave.
--
Dave Chinner
[email protected]
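
A userspace sketch of the BUG_ON() check proposed above. The macro here is a stand-in that prints a message and calls abort(); the kernel's real BUG_ON() behaves differently, and `check_scan_ret` is a hypothetical helper name used only for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's BUG_ON(); an assumption for this demo. */
#define BUG_ON(cond) \
	do { \
		if (cond) { \
			fprintf(stderr, "BUG: %s\n", #cond); \
			abort(); \
		} \
	} while (0)

/* Returns 1 if the scan should stop, 0 if ret is a count of freed objects. */
static int check_scan_ret(long shrink_ret)
{
	BUG_ON(shrink_ret < -1);	/* undefined value: the shrinker is broken */
	return shrink_ret == -1;
}
```

The one defined negative value (-1) and all non-negative counts pass through; anything below -1 stops the program immediately, so a broken shrinker cannot go unnoticed.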

2013-05-17 08:00:58

by Oskar Andero

Subject: Re: [PATCH] mm: vmscan: handle any negative return value from scan_objects

On 08:33 Fri 17 May , Dave Chinner wrote:
> On Thu, May 16, 2013 at 02:27:52PM +0200, Oskar Andero wrote:
> > On 13:52 Thu 16 May , Dave Chinner wrote:
> > > On Thu, May 16, 2013 at 10:42:16AM +0200, Oskar Andero wrote:
> > > > Shrinkers must return -1 to indicate that they are busy. Instead, treat
> > > > any negative value as busy.
> > >
> > > Why? The API defines a return condition for aborting a scan and gives
> > > a specific value for doing that. i.e. explain why we should change the
> > > API to over-specify the "abort scan" return value like this.
> >
> > As I pointed out earlier, looking into the code (from master):
> > 	if (shrink_ret == -1)
> > 		break;
> > 	if (shrink_ret < nr_before)
> > 		ret += nr_before - shrink_ret;
> >
> > This piece of code lacks a sanity check and will only function if shrink_ret
> > is either greater than zero or exactly -1. If shrink_ret is e.g. -2 this will
> > lead to undefined behaviour.
>
> If a shrinker is returning -2 then the shrinker is broken and needs
> fixing.

The point is: returning -2 is just as magical and meaningful as returning -1.

Usually, returning a negative means "failure" (Chapter 16 CodingStyle), not
a perfectly valid "abort scan" as in this piece of code.

> > > FWIW, using "any" negative number for "abort scan" is a bad API
> > > design decision. It means that in future we can't introduce
> > > different negative return values in the API if we have a need to.
> > > i.e. each specific negative return value needs to have the potential
> > > for defining a different behaviour.
> >
> > An alternative to my patch would be to add:
> > 	if (shrink_ret < -1)
> > 		/* handle illegal return code in some way */
>
> How? We have one valid negative return code. WTF are we supposed to
> do if a shrinker is passing undefined return values? IOWs, the only
> sane thing to do is:
>
> BUG_ON(shrink_ret < -1);

Yes, of course! BUG_ON() is the proper way to handle an illegal value.
Now we are getting somewhere!

> > > So if any change needs to be made, it is to change the -1 return
> > > value to an enum and have the shrinkers return that enum when they
> > > want an abort.
> >
> > I am all for an enum, but I still believe we should handle the case where
> > the shrinkers return something wicked.
>
> Which bit of "broken shrinkers need to be fixed" don't you
> understand? A BUG_ON() will make sure they get fixed - anything else
> that allows broken shrinkers to continue functioning is a completely
> unacceptable solution.

BUG_ON() is perfect IMO and if everyone is ok with that I will send
version 2 of my patch.

Now there is just the matter of returning hardcoded -1. Would an enum in
shrinker.h add any value? I have gotten different feedback on this - some
say yea, others nay.
I think I have motivated it enough in this thread, so I am not going to
repeat myself.

-Oskar