2015-04-01 08:05:41

by Jan Kara

Subject: Re: [PATCH 8/8] inode: don't softlockup when evicting inodes

Sorry for the late reply. I was ill last week...

On Fri 20-03-15 13:14:16, Josef Bacik wrote:
> On a box with a lot of RAM (148GB) I can make the box softlockup after running
> an fs_mark job that creates hundreds of millions of empty files. This is
> because we never generate enough memory pressure to keep the number of inodes on
> our unused list low, so when we go to unmount we have to evict ~100 million
> inodes. This makes one processor a very unhappy person, so add a cond_resched()
> in dispose_list() and cond_resched_lock() in the eviction isolation function to
> combat this. Thanks,
>
> Signed-off-by: Josef Bacik <[email protected]>
> ---
> fs/inode.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index b961e5a..c58dbd3 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -574,6 +574,7 @@ static void dispose_list(struct list_head *head)
>  		list_del_init(&inode->i_lru);
>
>  		evict(inode);
> +		cond_resched();
Fine.

>  	}
>  }
>
> @@ -592,6 +593,7 @@ void evict_inodes(struct super_block *sb)
>  	LIST_HEAD(dispose);
>
>  	spin_lock(&sb->s_inode_list_lock);
> +again:
>  	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
>  		if (atomic_read(&inode->i_count))
>  			continue;
> @@ -606,6 +608,14 @@ void evict_inodes(struct super_block *sb)
>  		inode_lru_list_del(inode);
>  		spin_unlock(&inode->i_lock);
>  		list_add(&inode->i_lru, &dispose);
> +
> +		/*
> +		 * We can have a ton of inodes to evict at unmount time given
> +		 * enough memory, check to see if we need to go to sleep for a
> +		 * bit so we don't livelock.
> +		 */
> +		if (cond_resched_lock(&sb->s_inode_list_lock))
> +			goto again;
Not so fine. How is this ever guaranteed to finish? We don't remove inodes
from the i_sb_list in this loop, so if we ever take 'goto again' we just
start doing all the work from the beginning...

What needs to happen is that if we need to resched, we drop
sb->s_inode_list_lock, call dispose_list(&dispose) and *then* restart from
the beginning since we have freed all the inodes that we isolated...
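
The restart scheme described above can be sketched in userspace roughly as
follows. This is only an illustration under stated assumptions, not the
actual fs/inode.c change: every fake_* name is hypothetical, the list lock is
represented by comments only, and the resched_every parameter is a
deterministic stand-in for "a reschedule is needed now". The point it shows
is that disposing of the isolated batch before restarting shrinks the list,
so every restart begins with fewer inodes to walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for an inode; not a kernel structure. */
struct fake_inode {
	struct fake_inode *prev, *next;   /* models i_sb_list linkage */
	struct fake_inode *dispose_next;  /* models i_lru on the private dispose list */
	int busy;                         /* models a non-zero i_count */
	int freeing;                      /* models I_FREEING */
};

struct fake_sb {
	struct fake_inode head;           /* circular list head, models sb->s_inodes */
	size_t nr_evicted;
	size_t nr_restarts;
};

/* Build a list of nr_inodes; every busy_every-th inode is busy (0 = none). */
static void fake_sb_init(struct fake_sb *sb, size_t nr_inodes, size_t busy_every)
{
	sb->head.prev = sb->head.next = &sb->head;
	sb->nr_evicted = sb->nr_restarts = 0;
	for (size_t i = 0; i < nr_inodes; i++) {
		struct fake_inode *inode = calloc(1, sizeof(*inode));

		if (!inode)
			abort();
		inode->busy = busy_every && (i % busy_every == 0);
		inode->prev = sb->head.prev;
		inode->next = &sb->head;
		sb->head.prev->next = inode;
		sb->head.prev = inode;
	}
}

/* Models dispose_list(): evicting an inode also unlinks it from the sb list. */
static void fake_dispose_list(struct fake_sb *sb, struct fake_inode *dispose)
{
	while (dispose) {
		struct fake_inode *inode = dispose;

		dispose = inode->dispose_next;
		inode->prev->next = inode->next;
		inode->next->prev = inode->prev;
		free(inode);
		sb->nr_evicted++;
	}
}

/* Isolate unused inodes; every resched_every isolations (0 = never), dispose and restart. */
static void fake_evict_inodes(struct fake_sb *sb, size_t resched_every)
{
	struct fake_inode *dispose, *inode;
	size_t isolated;

again:
	dispose = NULL;
	isolated = 0;
	/* the list lock would be taken here */
	for (inode = sb->head.next; inode != &sb->head; inode = inode->next) {
		if (inode->busy || inode->freeing)
			continue;
		inode->freeing = 1;
		inode->dispose_next = dispose;
		dispose = inode;
		if (++isolated == resched_every) {
			/* drop the lock, free the isolated batch, then restart */
			fake_dispose_list(sb, dispose);
			sb->nr_restarts++;
			goto again;
		}
	}
	/* the list lock would be dropped here */
	fake_dispose_list(sb, dispose);
}

/* Count the inodes still linked on the sb list. */
static size_t fake_sb_remaining(const struct fake_sb *sb)
{
	size_t n = 0;

	for (const struct fake_inode *i = sb->head.next; i != &sb->head; i = i->next)
		n++;
	return n;
}
```

With 1000 inodes of which every tenth is busy, calling
fake_evict_inodes(&sb, 64) evicts the 900 unused inodes in batches of 64 and
leaves only the busy ones on the list. Each restart walks only what is left
rather than re-walking inodes that were already isolated.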

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR


2015-04-07 15:03:56

by Josef Bacik

Subject: Re: [PATCH 8/8] inode: don't softlockup when evicting inodes

On 04/01/2015 04:05 AM, Jan Kara wrote:
> Sorry for the late reply. I was ill last week...
>

That's ok, I was on vacation for the last two weeks ;).

> On Fri 20-03-15 13:14:16, Josef Bacik wrote:
>> On a box with a lot of RAM (148GB) I can make the box softlockup after running
>> an fs_mark job that creates hundreds of millions of empty files. This is
>> because we never generate enough memory pressure to keep the number of inodes on
>> our unused list low, so when we go to unmount we have to evict ~100 million
>> inodes. This makes one processor a very unhappy person, so add a cond_resched()
>> in dispose_list() and cond_resched_lock() in the eviction isolation function to
>> combat this. Thanks,
>>
>> Signed-off-by: Josef Bacik <[email protected]>
>> ---
>> fs/inode.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/fs/inode.c b/fs/inode.c
>> index b961e5a..c58dbd3 100644
>> --- a/fs/inode.c
>> +++ b/fs/inode.c
>> @@ -574,6 +574,7 @@ static void dispose_list(struct list_head *head)
>>  		list_del_init(&inode->i_lru);
>>
>>  		evict(inode);
>> +		cond_resched();
> Fine.
>
>>  	}
>>  }
>>
>> @@ -592,6 +593,7 @@ void evict_inodes(struct super_block *sb)
>>  	LIST_HEAD(dispose);
>>
>>  	spin_lock(&sb->s_inode_list_lock);
>> +again:
>>  	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
>>  		if (atomic_read(&inode->i_count))
>>  			continue;
>> @@ -606,6 +608,14 @@ void evict_inodes(struct super_block *sb)
>>  		inode_lru_list_del(inode);
>>  		spin_unlock(&inode->i_lock);
>>  		list_add(&inode->i_lru, &dispose);
>> +
>> +		/*
>> +		 * We can have a ton of inodes to evict at unmount time given
>> +		 * enough memory, check to see if we need to go to sleep for a
>> +		 * bit so we don't livelock.
>> +		 */
>> +		if (cond_resched_lock(&sb->s_inode_list_lock))
>> +			goto again;
> Not so fine. How is this ever guaranteed to finish? We don't remove inodes
> from the i_sb_list in this loop, so if we ever take 'goto again' we just
> start doing all the work from the beginning...
>
> What needs to happen is that if we need to resched, we drop
> sb->s_inode_list_lock, call dispose_list(&dispose) and *then* restart from
> the beginning since we have freed all the inodes that we isolated...
>

Ooops, good point. I'll get this fixed up, thanks,

Josef