Hi Al,
I am currently looking into a customer dump and found what looks like
an issue in the dcache code. And I think the following commit of yours
has something to do with it:
commit fe91522a7ba82ca1a51b07e19954b3825e4aaa22
Author: Al Viro <[email protected]>
Date: Sat May 3 00:02:25 2014 -0400
don't remove from shrink list in select_collect()
If we find something already on a shrink list, just increment
data->found and do nothing else. Loops in shrink_dcache_parent() and
check_submounts_and_drop() will do the right thing - everything we
did put into our list will be evicted and if there had been nothing,
but data->found got non-zero, well, we have somebody else shrinking
those guys; just try again.
Signed-off-by: Al Viro <[email protected]>
The dump I got is based on kernel v4.4 but the affected dcache functions
look identical to the upstream version. Here is what I found in the dump:
A lot of "rcu_sched kthread starved for <xxx> jiffies!" messages
Only one CPU, currently running process "run-crons" task 0x65a8008
It just called check_and_drop from d_walk, full backchain:
PSW.addr check_and_drop at 30a0e8
%r14 d_walk at 308202
#0 [35b87b88] d_invalidate at 3096e8
#1 [35b87bd8] proc_flush_task at 37190c
#2 [35b87c58] release_task at 13f202
#3 [35b87cc8] wait_task_zombie at 13fc36
#4 [35b87d50] wait_consider_task at 140150
#5 [35b87dc0] do_wait at 1403de
#6 [35b87e18] sys_wait4 at 14181e
#7 [35b87ea8] system_call at 659ec4
The task's runtime is:
sum_exec_runtime 26813717162347 # nsec = 26813 seconds,
utime = 3991252 # cputime = 974 microseconds,
stime = 99132516783832 # cputime = 24202 seconds,
Task 0x65a8008 has TIF_NEED_RESCHED set
d_walk() just called check_and_drop via the finish() function pointer,
check_and_drop() will return and d_walk() will return as well.
Looks like an endless loop in d_invalidate().
The (struct dentry *) dentry in d_invalidate() is at 0x3cb15858
The struct detach_data data in d_invalidate() is at 0x35b87c28
dentry tree starting @ 0x3cb15858 has two entries in d_subdirs:
0x3cb15858 d_name.name: "11898"
0xb940d3d8 d_name.name: "cmdline"
0xb940dd98 d_name.name: "status"
crash> px *(struct dentry *) 0x3cb15858 | grep d_flags
d_flags = 0x2000cc,
crash> px *(struct dentry *) 0xb940d3d8 | grep d_flags
d_flags = 0x48048c, # DCACHE_SHRINK_LIST is set
crash> px *(struct dentry *) 0xb940dd98 | grep d_flags
d_flags = 0x48048c, # DCACHE_SHRINK_LIST is set
crash> px *(struct detach_data *) 0x35b87c28
$29 = {
select = {
start = 0x3cb15858,
dispose = {
next = 0x35b87c30,
prev = 0x35b87c30
},
found = 0x2
},
mountpoint = 0x0
}
select_collect() called from detach_and_collect() will increment
data.select.found in the struct detach_data @ 0x35b87c28 but will not
add any dentries to the dispose list. The shrink_dentry_list() call in
d_invalidate() will do nothing as the dispose list is empty. The two
dentries 0xb940d3d8 and 0xb940dd98 are still there. After d_walk()
returns, d_invalidate() finds data.mountpoint == NULL and
data.select.found == 2, so it starts the loop again without making any
progress.
As this is a single CPU system without kernel preemption there is nobody
else that will do the shrinking of those dcache entries.
In short, this if-statement in select_collect:
if (dentry->d_flags & DCACHE_SHRINK_LIST) {
data->found++;
}
with the assumption that "somebody else" will do the shrinking seems broken.
Do you agree?
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.
On Tue, 16 Oct 2018 13:15:28 +0200
Martin Schwidefsky <[email protected]> wrote:
> In short, this if-statement in select_collect:
>
> if (dentry->d_flags & DCACHE_SHRINK_LIST) {
> data->found++;
> }
>
> with assumption that "somebody else" will do the shrinking seems broken.
>
> Do you agree?
If I am not mistaken this problem should be fixed by upstream commit
4fb4887140 "restore cond_resched() in shrink_dcache_parent()"
which goes on top of
ff17fa561a "d_invalidate(): unhash immediately"
Due to the cond_resched() the task that set DCACHE_SHRINK_LIST for the
remaining two dcache entries will be scheduled eventually. This will
allow the task waiting for the deletion of these dcache entries
to continue, although some CPU cycles may get wasted.
--
blue skies,
Martin.
"Reality continues to ruin my life." - Calvin.