shrink_slab() queries each slab cache to get the number of
elements in it. In most cases such queries are cheap, but on
some caches they are not. For example, the Android
low-memory-killer, which operates as a slab shrinker, performs
a relatively long calculation every time it is invoked, which
is quite expensive.
This patch removes the redundant query to the shrinker
function in the shrink batch loop.
Signed-off-by: Heesub Shin <[email protected]>
---
mm/vmscan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fa6a853..11b6695 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -282,9 +282,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
                                         max_pass, delta, total_scan);
 
                 while (total_scan >= batch_size) {
-                        int nr_before;
+                        int nr_before = max_pass;
 
-                        nr_before = do_shrinker_shrink(shrinker, shrink, 0);
                         shrink_ret = do_shrinker_shrink(shrinker, shrink,
                                                         batch_size);
                         if (shrink_ret == -1)
@@ -293,6 +292,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
                         ret += nr_before - shrink_ret;
                         count_vm_events(SLABS_SCANNED, batch_size);
                         total_scan -= batch_size;
+                        max_pass = shrink_ret;
 
                         cond_resched();
                 }
--
1.8.3.1
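For reference, with this patch applied the batch loop in
shrink_slab() effectively reads as below (a simplified sketch of
mm/vmscan.c around v3.10; the shrink_ret < nr_before guard sits
between the two hunks above and is unchanged by the patch):

        while (total_scan >= batch_size) {
                /* Reuse the result of the previous pass instead of
                 * re-querying the shrinker with nr_to_scan == 0. */
                int nr_before = max_pass;

                shrink_ret = do_shrinker_shrink(shrinker, shrink, batch_size);
                if (shrink_ret == -1)
                        break;
                if (shrink_ret < nr_before)
                        ret += nr_before - shrink_ret;
                count_vm_events(SLABS_SCANNED, batch_size);
                total_scan -= batch_size;
                max_pass = shrink_ret;  /* baseline for the next pass */

                cond_resched();
        }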
Hello,
On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> shrink_slab() queries each slab cache to get the number of
> elements in it. In most cases such queries are cheap, but on
> some caches they are not. For example, the Android
> low-memory-killer, which operates as a slab shrinker, performs
> a relatively long calculation every time it is invoked, which
> is quite expensive.
LMK as a shrinker is really bad; nobody wanted it when we
reviewed it a few years ago, and that's one of the reasons LMK
hasn't been promoted to mainline yet. So your motivation is
already not attractive. ;-)
>
> This patch removes the redundant query to the shrinker
> function in the shrink batch loop.
I didn't review the patch, and I guess others won't want it
either, because slab shrinking is under construction and many
patches were already merged into mmotm. Please look at the
latest mmotm tree.
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
If your concern is still there and it's really a big concern
of MM we should take care of, NOT LMK, please resend it.
Thanks.
--
Kind regards,
Minchan Kim
On Sat, 15 Jun 2013 03:13:26 +0900 HeeSub Shin <[email protected]> wrote:
> Hello,
>
> On Fri, Jun 14, 2013 at 8:10 PM, Minchan Kim <[email protected]> wrote:
>
> >
> > Hello,
> >
> > On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> > > shrink_slab() queries each slab cache to get the number of
> > > elements in it. In most cases such queries are cheap, but on
> > > some caches they are not. For example, the Android
> > > low-memory-killer, which operates as a slab shrinker, performs
> > > a relatively long calculation every time it is invoked, which
> > > is quite expensive.
> >
> > LMK as a shrinker is really bad; nobody wanted it when we
> > reviewed it a few years ago, and that's one of the reasons LMK
> > hasn't been promoted to mainline yet. So your motivation is
> > already not attractive. ;-)
> >
> > >
> > > This patch removes the redundant query to the shrinker
> > > function in the shrink batch loop.
> >
> > I didn't review the patch, and I guess others won't want it
> > either, because slab shrinking is under construction and many
> > patches were already merged into mmotm. Please look at the
> > latest mmotm tree.
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
>
>
> >
> > If your concern is still there and it's really a big concern
> > of MM we should take care of, NOT LMK, please resend it.
> >
> >
I've noticed that there are huge changes in the recent mmotm
and you guys have already settled the issue of my concern. I
usually keep track of changes in the recent mm-tree, but this
time I didn't. My bad :-)
>
I'm not averse to merging an improvement like this even if it gets
rubbed out by forthcoming changes. The big changes may never get
merged or may be reverted. And by merging this patch, others are more
likely to grab it, backport it into earlier kernels and benefit from
it.
Also, the problem which this simple patch fixes might be present in a
different form after the large patchset has been merged. That does not
appear to be the case this time.
So I'd actually like to merge Heesub's patch. Problem is, I don't have
a way to redistribute it for testing - I'd need to effectively revert
the whole thing when integrating Glauber's stuff on top, so nobody who
is using linux-next would test Heesub's change. Drat.
However I'm a bit sceptical about the description here. The shrinker
is supposed to special-case the "nr_to_scan == 0" case and AFAICT
drivers/staging/android/lowmemorykiller.c:lowmem_shrink() does do this,
and it looks like the function will be pretty quick in this case.
In other words, the behaviour of lowmem_shrink(nr_to_scan == 0) does
not match Heesub's description. What's up with that?
Also, there is an obvious optimisation which we could make to
lowmem_shrink(). All this stuff:
        if (lowmem_adj_size < array_size)
                array_size = lowmem_adj_size;
        if (lowmem_minfree_size < array_size)
                array_size = lowmem_minfree_size;
        for (i = 0; i < array_size; i++) {
                if (other_free < lowmem_minfree[i] &&
                    other_file < lowmem_minfree[i]) {
                        min_score_adj = lowmem_adj[i];
                        break;
                }
        }
does nothing useful in the nr_to_scan==0 case and should be omitted for
this special case. But this problem was fixed in the large shrinker
rework in -mm.
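A minimal sketch of that reordering against the v3.9 staging
driver follows; lowmem_scan_and_kill() is a hypothetical
stand-in for the existing victim-selection code, and this is
not the actual fix that landed in the -mm rework:

        static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
        {
                /* Cheap count of reclaimable pages, used as the cache size. */
                int rem = global_page_state(NR_ACTIVE_ANON) +
                          global_page_state(NR_ACTIVE_FILE) +
                          global_page_state(NR_INACTIVE_ANON) +
                          global_page_state(NR_INACTIVE_FILE);

                /* Fast path: nr_to_scan == 0 only queries the cache size, so
                 * skip the lowmem_minfree[]/min_score_adj calculation and the
                 * task-list walk entirely. */
                if (sc->nr_to_scan == 0)
                        return rem;

                /* Slow path: compute min_score_adj and pick a victim, as the
                 * existing code does (hypothetical helper). */
                return lowmem_scan_and_kill(s, sc, rem);
        }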
On Sat, Jun 15, 2013 at 8:04 AM, Andrew Morton
<[email protected]> wrote:
>
> On Sat, 15 Jun 2013 03:13:26 +0900 HeeSub Shin <[email protected]> wrote:
>
> > Hello,
> >
> > On Fri, Jun 14, 2013 at 8:10 PM, Minchan Kim <[email protected]> wrote:
> >
> > >
> > > Hello,
> > >
> > > On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> > > > shrink_slab() queries each slab cache to get the number of
> > > > elements in it. In most cases such queries are cheap, but on
> > > > some caches they are not. For example, the Android
> > > > low-memory-killer, which operates as a slab shrinker, performs
> > > > a relatively long calculation every time it is invoked, which
> > > > is quite expensive.
> > >
> > > LMK as a shrinker is really bad; nobody wanted it when we
> > > reviewed it a few years ago, and that's one of the reasons LMK
> > > hasn't been promoted to mainline yet. So your motivation is
> > > already not attractive. ;-)
> > >
> > > >
> > > > This patch removes the redundant query to the shrinker
> > > > function in the shrink batch loop.
> > >
> > > I didn't review the patch, and I guess others won't want it
> > > either, because slab shrinking is under construction and many
> > > patches were already merged into mmotm. Please look at the
> > > latest mmotm tree.
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> >
> >
> > >
> > > If your concern is still there and it's really a big concern
> > > of MM we should take care of, NOT LMK, please resend it.
> > >
> > >
> > I've noticed that there are huge changes in the recent mmotm
> > and you guys have already settled the issue of my concern. I
> > usually keep track of changes in the recent mm-tree, but this
> > time I didn't. My bad :-)
> >
>
> I'm not averse to merging an improvement like this even if it gets
> rubbed out by forthcoming changes. The big changes may never get
> merged or may be reverted. And by merging this patch, others are more
> likely to grab it, backport it into earlier kernels and benefit from
> it.
>
> Also, the problem which this simple patch fixes might be present in a
> different form after the large patchset has been merged. That does not
> appear to be the case this time.
>
> So I'd actually like to merge Heesub's patch. Problem is, I don't have
> a way to redistribute it for testing - I'd need to effectively revert
> the whole thing when integrating Glauber's stuff on top, so nobody who
> is using linux-next would test Heesub's change. Drat.
>
>
>
>
> However I'm a bit sceptical about the description here. The shrinker
> is supposed to special-case the "nr_to_scan == 0" case and AFAICT
> drivers/staging/android/lowmemorykiller.c:lowmem_shrink() does do this,
> and it looks like the function will be pretty quick in this case.
>
> In other words, the behaviour of lowmem_shrink(nr_to_scan == 0) does
> not match Heesub's description. What's up with that?
>
Right, but the real use case is different from the mainline
kernel, and he found it. There are two approaches:
1. Reduce the do_shrinker_shrink() calls, as this patch does.
2. Optimize the shrinker function itself, as you suggest.
Thank you,
Kyungmin Park
>
>
> Also, there is an obvious optimisation which we could make to
> lowmem_shrink(). All this stuff:
>
>         if (lowmem_adj_size < array_size)
>                 array_size = lowmem_adj_size;
>         if (lowmem_minfree_size < array_size)
>                 array_size = lowmem_minfree_size;
>         for (i = 0; i < array_size; i++) {
>                 if (other_free < lowmem_minfree[i] &&
>                     other_file < lowmem_minfree[i]) {
>                         min_score_adj = lowmem_adj[i];
>                         break;
>                 }
>         }
>
> does nothing useful in the nr_to_scan==0 case and should be omitted for
> this special case. But this problem was fixed in the large shrinker
> rework in -mm.
>
Hello Andrew,
On Fri, Jun 14, 2013 at 04:04:25PM -0700, Andrew Morton wrote:
> On Sat, 15 Jun 2013 03:13:26 +0900 HeeSub Shin <[email protected]> wrote:
>
> > Hello,
> >
> > On Fri, Jun 14, 2013 at 8:10 PM, Minchan Kim <[email protected]> wrote:
> >
> > >
> > > Hello,
> > >
> > > On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> > > > shrink_slab() queries each slab cache to get the number of
> > > > elements in it. In most cases such queries are cheap, but on
> > > > some caches they are not. For example, the Android
> > > > low-memory-killer, which operates as a slab shrinker, performs
> > > > a relatively long calculation every time it is invoked, which
> > > > is quite expensive.
> > >
> > > LMK as a shrinker is really bad; nobody wanted it when we
> > > reviewed it a few years ago, and that's one of the reasons LMK
> > > hasn't been promoted to mainline yet. So your motivation is
> > > already not attractive. ;-)
> > >
> > > >
> > > > This patch removes the redundant query to the shrinker
> > > > function in the shrink batch loop.
> > >
> > > I didn't review the patch, and I guess others won't want it
> > > either, because slab shrinking is under construction and many
> > > patches were already merged into mmotm. Please look at the
> > > latest mmotm tree.
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> >
> >
> > >
> > > If your concern is still there and it's really a big concern
> > > of MM we should take care of, NOT LMK, please resend it.
> > >
> > >
> > I've noticed that there are huge changes in the recent mmotm
> > and you guys have already settled the issue of my concern. I
> > usually keep track of changes in the recent mm-tree, but this
> > time I didn't. My bad :-)
> >
>
> I'm not averse to merging an improvement like this even if it gets
> rubbed out by forthcoming changes. The big changes may never get
> merged or may be reverted. And by merging this patch, others are more
> likely to grab it, backport it into earlier kernels and benefit from
> it.
Fair enough.
>
> Also, the problem which this simple patch fixes might be present in a
> different form after the large patchset has been merged. That does not
> appear to be the case this time.
>
> So I'd actually like to merge Heesub's patch. Problem is, I don't have
> a way to redistribute it for testing - I'd need to effectively revert
> the whole thing when integrating Glauber's stuff on top, so nobody who
> is using linux-next would test Heesub's change. Drat.
True, but if you suggest a good reason to review the patch, I
will do so. I will reply to his patch mail.
>
>
>
>
> However I'm a bit sceptical about the description here. The shrinker
> is supposed to special-case the "nr_to_scan == 0" case and AFAICT
> drivers/staging/android/lowmemorykiller.c:lowmem_shrink() does do this,
> and it looks like the function will be pretty quick in this case.
>
> In other words, the behaviour of lowmem_shrink(nr_to_scan == 0) does
> not match Heesub's description. What's up with that?
>
>
>
> Also, there is an obvious optimisation which we could make to
> lowmem_shrink(). All this stuff:
>
>         if (lowmem_adj_size < array_size)
>                 array_size = lowmem_adj_size;
>         if (lowmem_minfree_size < array_size)
>                 array_size = lowmem_minfree_size;
>         for (i = 0; i < array_size; i++) {
>                 if (other_free < lowmem_minfree[i] &&
>                     other_file < lowmem_minfree[i]) {
>                         min_score_adj = lowmem_adj[i];
>                         break;
>                 }
>         }
>
> does nothing useful in the nr_to_scan==0 case and should be omitted for
> this special case. But this problem was fixed in the large shrinker
> rework in -mm.
--
Kind regards,
Minchan Kim
Hello,
Andrew wants to merge this, so I'll try to review it.
On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> shrink_slab() queries each slab cache to get the number of
> elements in it. In most cases such queries are cheap, but on
> some caches they are not. For example, the Android
> low-memory-killer, which operates as a slab shrinker, performs
> a relatively long calculation every time it is invoked, which
> is quite expensive.
Long calculation?
I am looking at lowmem_shrink() in v3.9 and I couldn't find
anything which would be slow in the slab-query case.
It does have rather unnecessary code calculating global
vmstat, but I don't think it's the culprit for your slowness.
Could you show how it is slow, with some numbers?
If it's true, we have to fix LMK.
We have already been told the following:
* Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
* querying the cache size, so a fastpath for that case is appropriate.
>
> This patch removes the redundant query to the shrinker
> function in the shrink batch loop.
>
> Signed-off-by: Heesub Shin <[email protected]>
> ---
> mm/vmscan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fa6a853..11b6695 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -282,9 +282,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>                                          max_pass, delta, total_scan);
>
>                  while (total_scan >= batch_size) {
> -                        int nr_before;
> +                        int nr_before = max_pass;
>
> -                        nr_before = do_shrinker_shrink(shrinker, shrink, 0);
>                          shrink_ret = do_shrinker_shrink(shrinker, shrink,
>                                                          batch_size);
>                          if (shrink_ret == -1)
> @@ -293,6 +292,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>                          ret += nr_before - shrink_ret;
>                          count_vm_events(SLABS_SCANNED, batch_size);
>                          total_scan -= batch_size;
> +                        max_pass = shrink_ret;
>
>                          cond_resched();
>                  }
It might be a good optimization, but one of the problems I can
see is cond_resched(). If the process is scheduled out and
another task consumes more objects from the slab, shrink_ret
becomes stale, so shrink_ret could be greater than nr_before in
the next iteration. In that case, we lose count of the freed
slab objects.
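To make the failure mode concrete, a worked illustration with
hypothetical numbers:

        /*
         * nr_before = max_pass = 1000;   count taken before the first batch
         *
         * batch 1: frees 128 objects, shrink_ret = 872
         *          ret += 1000 - 872 = 128      (correct)
         *          max_pass = 872;
         * cond_resched();                 another task adds 500 objects
         *
         * batch 2: nr_before = 872, but the cache now holds 1372; after
         *          freeing 128, shrink_ret = 1244 > nr_before, so this
         *          batch's work cannot be accounted and shrink_slab()'s
         *          return value undercounts the objects freed.
         */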
--
Kind regards,
Minchan Kim
On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> shrink_slab() queries each slab cache to get the number of
> elements in it. In most cases such queries are cheap, but on
> some caches they are not. For example, the Android
> low-memory-killer, which operates as a slab shrinker, performs
> a relatively long calculation every time it is invoked, which
> is quite expensive.
As has already been pointed out, the low memory killer is a badly
broken piece of code. I can't run a normal machine with it enabled
because it randomly kills processes whenever memory pressure is
generated. What it does is simply broken and hence arguing that it
has too much overhead is not a convincing argument for changing core
shrinker infrastructure.
> This patch removes the redundant query to the shrinker
> function in the shrink batch loop.
>
> Signed-off-by: Heesub Shin <[email protected]>
> ---
> mm/vmscan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fa6a853..11b6695 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -282,9 +282,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>                                          max_pass, delta, total_scan);
>
>                  while (total_scan >= batch_size) {
> -                        int nr_before;
> +                        int nr_before = max_pass;
>
> -                        nr_before = do_shrinker_shrink(shrinker, shrink, 0);
>                          shrink_ret = do_shrinker_shrink(shrinker, shrink,
>                                                          batch_size);
>                          if (shrink_ret == -1)
> @@ -293,6 +292,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>                          ret += nr_before - shrink_ret;
>                          count_vm_events(SLABS_SCANNED, batch_size);
>                          total_scan -= batch_size;
> +                        max_pass = shrink_ret;
>
>                          cond_resched();
>                  }
Shrinkers run concurrently on different CPUs, and so the state of
the cache being shrunk can change significantly when cond_resched()
actually yields the CPU. Hence we need to recalculate the current
state of the cache before we shrink again to get an accurate idea of
how much work the current loop has done. If we get this badly wrong,
the caller of shrink_slab() will get an incorrect idea of how much
work was actually done by the shrinkers....
This problem is fixed in mmotm by the change of shrinker API
that results in shrinker->scan_objects() returning the number
of objects freed directly, and hence it isn't necessary to
have a shrinker->count_objects() call in the scan loop
anymore. i.e. the reworked scan loop ends up like:
        while (total_scan >= batch_size) {
                unsigned long ret;

                shrinkctl->nr_to_scan = batch_size;
                ret = shrinker->scan_objects(shrinker, shrinkctl);

                if (ret == SHRINK_STOP)
                        break;
                freed += ret;

                count_vm_events(SLABS_SCANNED, batch_size);
                total_scan -= batch_size;
        }
So we've already solved the problem you are concerned about....
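For readers without mmotm at hand, the split API looks roughly
like this (field list abridged; details may differ from the
final merge):

        #define SHRINK_STOP (~0UL)      /* from scan_objects: abort scanning */

        struct shrinker {
                /* cheap query: how many objects are reclaimable right now */
                unsigned long (*count_objects)(struct shrinker *,
                                               struct shrink_control *sc);
                /* do the work: free up to sc->nr_to_scan objects and return
                 * the number actually freed, or SHRINK_STOP */
                unsigned long (*scan_objects)(struct shrinker *,
                                              struct shrink_control *sc);
                int seeks;      /* cost to recreate an object */
                long batch;     /* reclaim batch size, 0 = default */
                /* internal bookkeeping fields omitted */
        };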
Cheers,
Dave.
--
Dave Chinner
[email protected]
Hello,
On 06/17/2013 09:08 AM, Dave Chinner wrote:
> On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
>> shrink_slab() queries each slab cache to get the number of
>> elements in it. In most cases such queries are cheap, but on
>> some caches they are not. For example, the Android
>> low-memory-killer, which operates as a slab shrinker, performs
>> a relatively long calculation every time it is invoked, which
>> is quite expensive.
>
> As has already been pointed out, the low memory killer is a badly
> broken piece of code. I can't run a normal machine with it enabled
> because it randomly kills processes whenever memory pressure is
> generated. What it does is simply broken and hence arguing that it
> has too much overhead is not a convincing argument for changing core
> shrinker infrastructure.
>
>> This patch removes the redundant query to the shrinker
>> function in the shrink batch loop.
>>
>> Signed-off-by: Heesub Shin <[email protected]>
>> ---
>> mm/vmscan.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index fa6a853..11b6695 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -282,9 +282,8 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>                                          max_pass, delta, total_scan);
>>
>>                  while (total_scan >= batch_size) {
>> -                        int nr_before;
>> +                        int nr_before = max_pass;
>>
>> -                        nr_before = do_shrinker_shrink(shrinker, shrink, 0);
>>                          shrink_ret = do_shrinker_shrink(shrinker, shrink,
>>                                                          batch_size);
>>                          if (shrink_ret == -1)
>> @@ -293,6 +292,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>>                          ret += nr_before - shrink_ret;
>>                          count_vm_events(SLABS_SCANNED, batch_size);
>>                          total_scan -= batch_size;
>> +                        max_pass = shrink_ret;
>>
>>                          cond_resched();
>>                  }
>
> Shrinkers run concurrently on different CPUs, and so the state of
> the cache being shrunk can change significantly when cond_resched()
> actually yields the CPU. Hence we need to recalculate the current
> state of the cache before we shrink again to get an accurate idea of
> how much work the current loop has done. If we get this badly wrong,
> the caller of shrink_slab() will get an incorrect idea of how much
> work was actually done by the shrinkers....
>
> This problem is fixed in mmotm by the change of shrinker API
> that results in shrinker->scan_objects() returning the number
> of objects freed directly, and hence it isn't necessary to
> have a shrinker->count_objects() call in the scan loop
> anymore. i.e. the reworked scan loop ends up like:
>
>         while (total_scan >= batch_size) {
>                 unsigned long ret;
>
>                 shrinkctl->nr_to_scan = batch_size;
>                 ret = shrinker->scan_objects(shrinker, shrinkctl);
>
>                 if (ret == SHRINK_STOP)
>                         break;
>                 freed += ret;
>
>                 count_vm_events(SLABS_SCANNED, batch_size);
>                 total_scan -= batch_size;
>         }
>
> So we've already solved the problem you are concerned about....
>
> Cheers,
>
> Dave.
>
Thank you for all your comments. I have been keeping up with
the mm-list for a while, but this was my first time sending
out patches and such. I only intended to ask for your reviews
and feedback. I will make sure I get over the learning curve
by next time around.
Thank you again, mm guys: Dave, Minchan and Andrew.
--
Heesub