Call cond_resched() from shrink_dentry_list() to preserve
shrink_dcache_parent() interactivity.
void shrink_dcache_parent(struct dentry * parent)
{
while ((found = select_parent(parent, &dispose)) != 0)
shrink_dentry_list(&dispose);
}
select_parent() populates the dispose list with dentries which
shrink_dentry_list() then deletes. select_parent() carefully uses
need_resched() to avoid doing too much work at once. But neither
shrink_dcache_parent() nor its called functions call cond_resched().
So once need_resched() is set select_parent() will return single
dentry dispose list which is then deleted by shrink_dentry_list().
This is inefficient when there are a lot of dentry to process. This
can cause softlockup and hurts interactivity on non preemptable
kernels.
This change adds a call to cond_resched() in shrink_dentry_list().
The primary benefit of this is that need_resched() is quickly cleared
so that future calls to select_parent() are able to efficiently return
a big batch of dentry. A theoretically secondary benefit of this
change is that shrink_dentry_list() is willing to give up the
processor when working on a huge number of dentry.
These additional cond_resched() do not seem to impact performance, at
least for the workload below.
Here is a program which can cause soft lockup on a if other system
activity sets need_resched().
int main()
{
struct rlimit rlim;
int i;
int f[100000];
char buf[20];
struct timeval t1, t2;
double diff;
/* cleanup past run */
system("rm -rf x");
/* boost nfile rlimit */
rlim.rlim_cur = 200000;
rlim.rlim_max = 200000;
if (setrlimit(RLIMIT_NOFILE, &rlim))
err(1, "setrlimit");
/* make directory for files */
if (mkdir("x", 0700))
err(1, "mkdir");
if (gettimeofday(&t1, NULL))
err(1, "gettimeofday");
/* populate directory with open files */
for (i = 0; i < 100000; i++) {
snprintf(buf, sizeof(buf), "x/%d", i);
f[i] = open(buf, O_CREAT);
if (f[i] == -1)
err(1, "open");
}
/* close some of the files */
for (i = 0; i < 85000; i++)
close(f[i]);
/* unlink all files, even open ones */
system("rm -rf x");
if (gettimeofday(&t2, NULL))
err(1, "gettimeofday");
diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
((double)t1.tv_sec * 1000000 + t1.tv_usec));
printf("done: %g elapsed\n", diff/1e6);
return 0;
}
Signed-off-by: Greg Thelen <[email protected]>
---
fs/dcache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/dcache.c b/fs/dcache.c
index fbfae008..105e973 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -818,6 +818,8 @@ static void shrink_dentry_list(struct list_head *list)
try_prune_one_dentry(dentry);
+ cond_resched();
+
rcu_read_lock();
}
rcu_read_unlock();
--
1.8.1.3
On Mon, Mar 25, 2013 at 10:22:31AM -0700, Greg Thelen wrote:
> Call cond_resched() from shrink_dentry_list() to preserve
> shrink_dcache_parent() interactivity.
>
> void shrink_dcache_parent(struct dentry * parent)
> {
> while ((found = select_parent(parent, &dispose)) != 0)
> shrink_dentry_list(&dispose);
> }
>
> select_parent() populates the dispose list with dentries which
> shrink_dentry_list() then deletes. select_parent() carefully uses
> need_resched() to avoid doing too much work at once. But neither
> shrink_dcache_parent() nor its called functions call cond_resched().
> So once need_resched() is set select_parent() will return single
> dentry dispose list which is then deleted by shrink_dentry_list().
> This is inefficient when there are a lot of dentry to process. This
> can cause softlockup and hurts interactivity on non preemptable
> kernels.
Hi Greg,
I can see how this coul dcause problems, but isn't the problem then
that shrink_dcache_parent()/select_parent() itself is mishandling
the need for rescheduling rather than being a problem with
the shrink_dentry_list() implementation? i.e. select_parent() is
aborting batching based on a need for rescheduling, but then not
doing that itself and assuming that someone else will do the
reschedule for it?
Perhaps this is a better approach:
- while ((found = select_parent(parent, &dispose)) != 0)
+ while ((found = select_parent(parent, &dispose)) != 0) {
shrink_dentry_list(&dispose);
+ cond_resched();
+ }
With this, select_parent() stops batching when a resched is needed,
we dispose of the list as a single batch and only then resched if it
was needed before we go and grab the next batch. That should fix the
"small batch" problem without the potential for changing the
shrink_dentry_list() behaviour adversely for other users....
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Mon, Mar 25 2013, Dave Chinner wrote:
> On Mon, Mar 25, 2013 at 10:22:31AM -0700, Greg Thelen wrote:
>> Call cond_resched() from shrink_dentry_list() to preserve
>> shrink_dcache_parent() interactivity.
>>
>> void shrink_dcache_parent(struct dentry * parent)
>> {
>> while ((found = select_parent(parent, &dispose)) != 0)
>> shrink_dentry_list(&dispose);
>> }
>>
>> select_parent() populates the dispose list with dentries which
>> shrink_dentry_list() then deletes. select_parent() carefully uses
>> need_resched() to avoid doing too much work at once. But neither
>> shrink_dcache_parent() nor its called functions call cond_resched().
>> So once need_resched() is set select_parent() will return single
>> dentry dispose list which is then deleted by shrink_dentry_list().
>> This is inefficient when there are a lot of dentry to process. This
>> can cause softlockup and hurts interactivity on non preemptable
>> kernels.
>
> Hi Greg,
>
> I can see how this coul dcause problems, but isn't the problem then
> that shrink_dcache_parent()/select_parent() itself is mishandling
> the need for rescheduling rather than being a problem with
> the shrink_dentry_list() implementation? i.e. select_parent() is
> aborting batching based on a need for rescheduling, but then not
> doing that itself and assuming that someone else will do the
> reschedule for it?
>
> Perhaps this is a better approach:
>
> - while ((found = select_parent(parent, &dispose)) != 0)
> + while ((found = select_parent(parent, &dispose)) != 0) {
> shrink_dentry_list(&dispose);
> + cond_resched();
> + }
>
> With this, select_parent() stops batching when a resched is needed,
> we dispose of the list as a single batch and only then resched if it
> was needed before we go and grab the next batch. That should fix the
> "small batch" problem without the potential for changing the
> shrink_dentry_list() behaviour adversely for other users....
I considered only modifying shrink_dcache_parent() as you show above.
Either approach fixes the problem I've seen. My initial approach adds
cond_resched() deeper into shrink_dentry_list() because I thought that
there might a secondary benefit: shrink_dentry_list() would be willing
to give up the processor when working on a huge number of dentry. This
could improve interactivity during shrinker and umount. I don't feel
strongly on this and would be willing to test and post the
add-cond_resched-to-shrink_dcache_parent approach.
On Mon, Mar 25, 2013 at 05:39:13PM -0700, Greg Thelen wrote:
> On Mon, Mar 25 2013, Dave Chinner wrote:
> > On Mon, Mar 25, 2013 at 10:22:31AM -0700, Greg Thelen wrote:
> >> Call cond_resched() from shrink_dentry_list() to preserve
> >> shrink_dcache_parent() interactivity.
> >>
> >> void shrink_dcache_parent(struct dentry * parent)
> >> {
> >> while ((found = select_parent(parent, &dispose)) != 0)
> >> shrink_dentry_list(&dispose);
> >> }
> >>
> >> select_parent() populates the dispose list with dentries which
> >> shrink_dentry_list() then deletes. select_parent() carefully uses
> >> need_resched() to avoid doing too much work at once. But neither
> >> shrink_dcache_parent() nor its called functions call cond_resched().
> >> So once need_resched() is set select_parent() will return single
> >> dentry dispose list which is then deleted by shrink_dentry_list().
> >> This is inefficient when there are a lot of dentry to process. This
> >> can cause softlockup and hurts interactivity on non preemptable
> >> kernels.
> >
> > Hi Greg,
> >
> > I can see how this coul dcause problems, but isn't the problem then
> > that shrink_dcache_parent()/select_parent() itself is mishandling
> > the need for rescheduling rather than being a problem with
> > the shrink_dentry_list() implementation? i.e. select_parent() is
> > aborting batching based on a need for rescheduling, but then not
> > doing that itself and assuming that someone else will do the
> > reschedule for it?
> >
> > Perhaps this is a better approach:
> >
> > - while ((found = select_parent(parent, &dispose)) != 0)
> > + while ((found = select_parent(parent, &dispose)) != 0) {
> > shrink_dentry_list(&dispose);
> > + cond_resched();
> > + }
> >
> > With this, select_parent() stops batching when a resched is needed,
> > we dispose of the list as a single batch and only then resched if it
> > was needed before we go and grab the next batch. That should fix the
> > "small batch" problem without the potential for changing the
> > shrink_dentry_list() behaviour adversely for other users....
>
> I considered only modifying shrink_dcache_parent() as you show above.
> Either approach fixes the problem I've seen. My initial approach adds
> cond_resched() deeper into shrink_dentry_list() because I thought that
> there might a secondary benefit: shrink_dentry_list() would be willing
> to give up the processor when working on a huge number of dentry. This
> could improve interactivity during shrinker and umount. I don't feel
> strongly on this and would be willing to test and post the
> add-cond_resched-to-shrink_dcache_parent approach.
The shrinker has interactivity problems because of the global
dcache_lru_lock, not because of ithe size of the list passed to
shrink_dentry_list(). The amount of work that shrink_dentry_list()
does here is already bound by the shrinker batch size. Hence in the
absence of the RT folk complaining about significant holdoffs I
don't think there is an interactivity problem through the shrinker
path.
As for the unmount path - shrink_dcache_for_umount_subtree() - that
doesn't use shrink_dentry_list() and so would need it's own internal
calls to cond_resched(). Perhaps it's shrink_dcache_sb() that you
are concerned about? Either way, And there are lots more similar
issues in the unmount path such as evict_inodes(), so unless you are
going to give every possible path through unmount/remount/bdev
invalidation the same treatment then changing shrink_dentry_list()
won't significantly improve the interactivity of the system
situation in these paths...
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Mon, Mar 25 2013, Dave Chinner wrote:
> On Mon, Mar 25, 2013 at 05:39:13PM -0700, Greg Thelen wrote:
>> On Mon, Mar 25 2013, Dave Chinner wrote:
>> > On Mon, Mar 25, 2013 at 10:22:31AM -0700, Greg Thelen wrote:
>> >> Call cond_resched() from shrink_dentry_list() to preserve
>> >> shrink_dcache_parent() interactivity.
>> >>
>> >> void shrink_dcache_parent(struct dentry * parent)
>> >> {
>> >> while ((found = select_parent(parent, &dispose)) != 0)
>> >> shrink_dentry_list(&dispose);
>> >> }
>> >>
>> >> select_parent() populates the dispose list with dentries which
>> >> shrink_dentry_list() then deletes. select_parent() carefully uses
>> >> need_resched() to avoid doing too much work at once. But neither
>> >> shrink_dcache_parent() nor its called functions call cond_resched().
>> >> So once need_resched() is set select_parent() will return single
>> >> dentry dispose list which is then deleted by shrink_dentry_list().
>> >> This is inefficient when there are a lot of dentry to process. This
>> >> can cause softlockup and hurts interactivity on non preemptable
>> >> kernels.
>> >
>> > Hi Greg,
>> >
>> > I can see how this coul dcause problems, but isn't the problem then
>> > that shrink_dcache_parent()/select_parent() itself is mishandling
>> > the need for rescheduling rather than being a problem with
>> > the shrink_dentry_list() implementation? i.e. select_parent() is
>> > aborting batching based on a need for rescheduling, but then not
>> > doing that itself and assuming that someone else will do the
>> > reschedule for it?
>> >
>> > Perhaps this is a better approach:
>> >
>> > - while ((found = select_parent(parent, &dispose)) != 0)
>> > + while ((found = select_parent(parent, &dispose)) != 0) {
>> > shrink_dentry_list(&dispose);
>> > + cond_resched();
>> > + }
>> >
>> > With this, select_parent() stops batching when a resched is needed,
>> > we dispose of the list as a single batch and only then resched if it
>> > was needed before we go and grab the next batch. That should fix the
>> > "small batch" problem without the potential for changing the
>> > shrink_dentry_list() behaviour adversely for other users....
>>
>> I considered only modifying shrink_dcache_parent() as you show above.
>> Either approach fixes the problem I've seen. My initial approach adds
>> cond_resched() deeper into shrink_dentry_list() because I thought that
>> there might a secondary benefit: shrink_dentry_list() would be willing
>> to give up the processor when working on a huge number of dentry. This
>> could improve interactivity during shrinker and umount. I don't feel
>> strongly on this and would be willing to test and post the
>> add-cond_resched-to-shrink_dcache_parent approach.
>
> The shrinker has interactivity problems because of the global
> dcache_lru_lock, not because of ithe size of the list passed to
> shrink_dentry_list(). The amount of work that shrink_dentry_list()
> does here is already bound by the shrinker batch size. Hence in the
> absence of the RT folk complaining about significant holdoffs I
> don't think there is an interactivity problem through the shrinker
> path.
No arguments from me.
> As for the unmount path - shrink_dcache_for_umount_subtree() - that
> doesn't use shrink_dentry_list() and so would need it's own internal
> calls to cond_resched(). Perhaps it's shrink_dcache_sb() that you
> are concerned about? Either way, And there are lots more similar
> issues in the unmount path such as evict_inodes(), so unless you are
> going to give every possible path through unmount/remount/bdev
> invalidation the same treatment then changing shrink_dentry_list()
> won't significantly improve the interactivity of the system
> situation in these paths...
Ok. As stated, I wasn't sure if the cond_resched() in
shrink_dentry_list() had any appeal. Apparently it doesn't. I'll drop
this approach in favor of the following:
--->8---
From: Greg Thelen <[email protected]>
Date: Sat, 23 Mar 2013 18:25:02 -0700
Subject: [PATCH] vfs: dcache: cond_resched in shrink_dcache_parent
Call cond_resched() in shrink_dcache_parent() to maintain
interactivity.
Before this patch:
void shrink_dcache_parent(struct dentry * parent)
{
while ((found = select_parent(parent, &dispose)) != 0)
shrink_dentry_list(&dispose);
}
select_parent() populates the dispose list with dentries which
shrink_dentry_list() then deletes. select_parent() carefully uses
need_resched() to avoid doing too much work at once. But neither
shrink_dcache_parent() nor its called functions call cond_resched().
So once need_resched() is set select_parent() will return single
dentry dispose list which is then deleted by shrink_dentry_list().
This is inefficient when there are a lot of dentry to process. This
can cause softlockup and hurts interactivity on non preemptable
kernels.
This change adds cond_resched() in shrink_dcache_parent(). The
benefit of this is that need_resched() is quickly cleared so that
future calls to select_parent() are able to efficiently return a big
batch of dentry.
These additional cond_resched() do not seem to impact performance, at
least for the workload below.
Here is a program which can cause soft lockup on a if other system
activity sets need_resched().
int main()
{
struct rlimit rlim;
int i;
int f[100000];
char buf[20];
struct timeval t1, t2;
double diff;
/* cleanup past run */
system("rm -rf x");
/* boost nfile rlimit */
rlim.rlim_cur = 200000;
rlim.rlim_max = 200000;
if (setrlimit(RLIMIT_NOFILE, &rlim))
err(1, "setrlimit");
/* make directory for files */
if (mkdir("x", 0700))
err(1, "mkdir");
if (gettimeofday(&t1, NULL))
err(1, "gettimeofday");
/* populate directory with open files */
for (i = 0; i < 100000; i++) {
snprintf(buf, sizeof(buf), "x/%d", i);
f[i] = open(buf, O_CREAT);
if (f[i] == -1)
err(1, "open");
}
/* close some of the files */
for (i = 0; i < 85000; i++)
close(f[i]);
/* unlink all files, even open ones */
system("rm -rf x");
if (gettimeofday(&t2, NULL))
err(1, "gettimeofday");
diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
((double)t1.tv_sec * 1000000 + t1.tv_usec));
printf("done: %g elapsed\n", diff/1e6);
return 0;
}
Signed-off-by: Greg Thelen <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
---
fs/dcache.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index fbfae008..e52c07e 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1230,8 +1230,10 @@ void shrink_dcache_parent(struct dentry * parent)
LIST_HEAD(dispose);
int found;
- while ((found = select_parent(parent, &dispose)) != 0)
+ while ((found = select_parent(parent, &dispose)) != 0) {
shrink_dentry_list(&dispose);
+ cond_resched();
+ }
}
EXPORT_SYMBOL(shrink_dcache_parent);
--
1.8.1.3
On Mon, Mar 25 2013, Greg Thelen wrote:
> On Mon, Mar 25 2013, Dave Chinner wrote:
>
>> On Mon, Mar 25, 2013 at 05:39:13PM -0700, Greg Thelen wrote:
>>> On Mon, Mar 25 2013, Dave Chinner wrote:
>>> > On Mon, Mar 25, 2013 at 10:22:31AM -0700, Greg Thelen wrote:
>>> >> Call cond_resched() from shrink_dentry_list() to preserve
>>> >> shrink_dcache_parent() interactivity.
>>> >>
>>> >> void shrink_dcache_parent(struct dentry * parent)
>>> >> {
>>> >> while ((found = select_parent(parent, &dispose)) != 0)
>>> >> shrink_dentry_list(&dispose);
>>> >> }
>>> >>
>>> >> select_parent() populates the dispose list with dentries which
>>> >> shrink_dentry_list() then deletes. select_parent() carefully uses
>>> >> need_resched() to avoid doing too much work at once. But neither
>>> >> shrink_dcache_parent() nor its called functions call cond_resched().
>>> >> So once need_resched() is set select_parent() will return single
>>> >> dentry dispose list which is then deleted by shrink_dentry_list().
>>> >> This is inefficient when there are a lot of dentry to process. This
>>> >> can cause softlockup and hurts interactivity on non preemptable
>>> >> kernels.
>>> >
>>> > Hi Greg,
>>> >
>>> > I can see how this coul dcause problems, but isn't the problem then
>>> > that shrink_dcache_parent()/select_parent() itself is mishandling
>>> > the need for rescheduling rather than being a problem with
>>> > the shrink_dentry_list() implementation? i.e. select_parent() is
>>> > aborting batching based on a need for rescheduling, but then not
>>> > doing that itself and assuming that someone else will do the
>>> > reschedule for it?
>>> >
>>> > Perhaps this is a better approach:
>>> >
>>> > - while ((found = select_parent(parent, &dispose)) != 0)
>>> > + while ((found = select_parent(parent, &dispose)) != 0) {
>>> > shrink_dentry_list(&dispose);
>>> > + cond_resched();
>>> > + }
>>> >
>>> > With this, select_parent() stops batching when a resched is needed,
>>> > we dispose of the list as a single batch and only then resched if it
>>> > was needed before we go and grab the next batch. That should fix the
>>> > "small batch" problem without the potential for changing the
>>> > shrink_dentry_list() behaviour adversely for other users....
>>>
>>> I considered only modifying shrink_dcache_parent() as you show above.
>>> Either approach fixes the problem I've seen. My initial approach adds
>>> cond_resched() deeper into shrink_dentry_list() because I thought that
>>> there might a secondary benefit: shrink_dentry_list() would be willing
>>> to give up the processor when working on a huge number of dentry. This
>>> could improve interactivity during shrinker and umount. I don't feel
>>> strongly on this and would be willing to test and post the
>>> add-cond_resched-to-shrink_dcache_parent approach.
>>
>> The shrinker has interactivity problems because of the global
>> dcache_lru_lock, not because of ithe size of the list passed to
>> shrink_dentry_list(). The amount of work that shrink_dentry_list()
>> does here is already bound by the shrinker batch size. Hence in the
>> absence of the RT folk complaining about significant holdoffs I
>> don't think there is an interactivity problem through the shrinker
>> path.
>
> No arguments from me.
>
>> As for the unmount path - shrink_dcache_for_umount_subtree() - that
>> doesn't use shrink_dentry_list() and so would need it's own internal
>> calls to cond_resched(). Perhaps it's shrink_dcache_sb() that you
>> are concerned about? Either way, And there are lots more similar
>> issues in the unmount path such as evict_inodes(), so unless you are
>> going to give every possible path through unmount/remount/bdev
>> invalidation the same treatment then changing shrink_dentry_list()
>> won't significantly improve the interactivity of the system
>> situation in these paths...
>
> Ok. As stated, I wasn't sure if the cond_resched() in
> shrink_dentry_list() had any appeal. Apparently it doesn't. I'll drop
> this approach in favor of the following:
>
> --->8---
>
> From: Greg Thelen <[email protected]>
> Date: Sat, 23 Mar 2013 18:25:02 -0700
> Subject: [PATCH] vfs: dcache: cond_resched in shrink_dcache_parent
>
> Call cond_resched() in shrink_dcache_parent() to maintain
> interactivity.
>
> Before this patch:
>
> void shrink_dcache_parent(struct dentry * parent)
> {
> while ((found = select_parent(parent, &dispose)) != 0)
> shrink_dentry_list(&dispose);
> }
>
> select_parent() populates the dispose list with dentries which
> shrink_dentry_list() then deletes. select_parent() carefully uses
> need_resched() to avoid doing too much work at once. But neither
> shrink_dcache_parent() nor its called functions call cond_resched().
> So once need_resched() is set select_parent() will return single
> dentry dispose list which is then deleted by shrink_dentry_list().
> This is inefficient when there are a lot of dentry to process. This
> can cause softlockup and hurts interactivity on non preemptable
> kernels.
>
> This change adds cond_resched() in shrink_dcache_parent(). The
> benefit of this is that need_resched() is quickly cleared so that
> future calls to select_parent() are able to efficiently return a big
> batch of dentry.
>
> These additional cond_resched() do not seem to impact performance, at
> least for the workload below.
>
> Here is a program which can cause soft lockup on a if other system
> activity sets need_resched().
>
> int main()
> {
> struct rlimit rlim;
> int i;
> int f[100000];
> char buf[20];
> struct timeval t1, t2;
> double diff;
>
> /* cleanup past run */
> system("rm -rf x");
>
> /* boost nfile rlimit */
> rlim.rlim_cur = 200000;
> rlim.rlim_max = 200000;
> if (setrlimit(RLIMIT_NOFILE, &rlim))
> err(1, "setrlimit");
>
> /* make directory for files */
> if (mkdir("x", 0700))
> err(1, "mkdir");
>
> if (gettimeofday(&t1, NULL))
> err(1, "gettimeofday");
>
> /* populate directory with open files */
> for (i = 0; i < 100000; i++) {
> snprintf(buf, sizeof(buf), "x/%d", i);
> f[i] = open(buf, O_CREAT);
> if (f[i] == -1)
> err(1, "open");
> }
>
> /* close some of the files */
> for (i = 0; i < 85000; i++)
> close(f[i]);
>
> /* unlink all files, even open ones */
> system("rm -rf x");
>
> if (gettimeofday(&t2, NULL))
> err(1, "gettimeofday");
>
> diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
> ((double)t1.tv_sec * 1000000 + t1.tv_usec));
>
> printf("done: %g elapsed\n", diff/1e6);
> return 0;
> }
>
> Signed-off-by: Greg Thelen <[email protected]>
> Signed-off-by: Dave Chinner <[email protected]>
> ---
> fs/dcache.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index fbfae008..e52c07e 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1230,8 +1230,10 @@ void shrink_dcache_parent(struct dentry * parent)
> LIST_HEAD(dispose);
> int found;
>
> - while ((found = select_parent(parent, &dispose)) != 0)
> + while ((found = select_parent(parent, &dispose)) != 0) {
> shrink_dentry_list(&dispose);
> + cond_resched();
> + }
> }
> EXPORT_SYMBOL(shrink_dcache_parent);
Should this change go through Al's or Andrew's branch?
On Tue, 09 Apr 2013 17:37:20 -0700 Greg Thelen <[email protected]> wrote:
> > Call cond_resched() in shrink_dcache_parent() to maintain
> > interactivity.
> >
> > Before this patch:
> >
> > void shrink_dcache_parent(struct dentry * parent)
> > {
> > while ((found = select_parent(parent, &dispose)) != 0)
> > shrink_dentry_list(&dispose);
> > }
> >
> > select_parent() populates the dispose list with dentries which
> > shrink_dentry_list() then deletes. select_parent() carefully uses
> > need_resched() to avoid doing too much work at once. But neither
> > shrink_dcache_parent() nor its called functions call cond_resched().
> > So once need_resched() is set select_parent() will return single
> > dentry dispose list which is then deleted by shrink_dentry_list().
> > This is inefficient when there are a lot of dentry to process. This
> > can cause softlockup and hurts interactivity on non preemptable
> > kernels.
> >
> > This change adds cond_resched() in shrink_dcache_parent(). The
> > benefit of this is that need_resched() is quickly cleared so that
> > future calls to select_parent() are able to efficiently return a big
> > batch of dentry.
> >
> > These additional cond_resched() do not seem to impact performance, at
> > least for the workload below.
> >
> > Here is a program which can cause soft lockup on a if other system
> > activity sets need_resched().
I was unable to guess what word was missing from "on a if other" ;)
> Should this change go through Al's or Andrew's branch?
I'll fight him for it.
Softlockups are fairly serious, so I'll put a cc:stable in there. Or
were the changes which triggered this problem added after 3.9?
On Wed, Apr 10 2013, Andrew Morton wrote:
> On Tue, 09 Apr 2013 17:37:20 -0700 Greg Thelen <[email protected]> wrote:
>
>> > Call cond_resched() in shrink_dcache_parent() to maintain
>> > interactivity.
>> >
>> > Before this patch:
>> >
>> > void shrink_dcache_parent(struct dentry * parent)
>> > {
>> > while ((found = select_parent(parent, &dispose)) != 0)
>> > shrink_dentry_list(&dispose);
>> > }
>> >
>> > select_parent() populates the dispose list with dentries which
>> > shrink_dentry_list() then deletes. select_parent() carefully uses
>> > need_resched() to avoid doing too much work at once. But neither
>> > shrink_dcache_parent() nor its called functions call cond_resched().
>> > So once need_resched() is set select_parent() will return single
>> > dentry dispose list which is then deleted by shrink_dentry_list().
>> > This is inefficient when there are a lot of dentry to process. This
>> > can cause softlockup and hurts interactivity on non preemptable
>> > kernels.
>> >
>> > This change adds cond_resched() in shrink_dcache_parent(). The
>> > benefit of this is that need_resched() is quickly cleared so that
>> > future calls to select_parent() are able to efficiently return a big
>> > batch of dentry.
>> >
>> > These additional cond_resched() do not seem to impact performance, at
>> > least for the workload below.
>> >
>> > Here is a program which can cause soft lockup on a if other system
>> > activity sets need_resched().
>
> I was unable to guess what word was missing from "on a if other" ;)
Less is more ;) Reword to:
Here is a program which can cause soft lockup if other system activity
sets need_resched().
>> Should this change go through Al's or Andrew's branch?
>
> I'll fight him for it.
Thanks.
> Softlockups are fairly serious, so I'll put a cc:stable in there. Or
> were the changes which triggered this problem added after 3.9?
This also applies to stable. I see the problem at least back to v3.3.
I did not test earlier kernels, but could if you want.