2009-04-23 07:40:31

by Chris Wright

[permalink] [raw]
Subject: [patch 035/100] vfs: skip I_CLEAR state inodes

-stable review patch. If anyone has any objections, please let us know.
---------------------

From: Wu Fengguang <[email protected]>

upstream commit: b6fac63cc1f52ec27f29fe6c6c8494a2ffac33fd

clear_inode() will switch inode state from I_FREEING to I_CLEAR, and do so
_outside_ of inode_lock. So any I_FREEING testing is incomplete without a
coupled testing of I_CLEAR.

So add I_CLEAR tests to drop_pagecache_sb(), generic_sync_sb_inodes() and
add_dquot_ref().

Masayoshi MIZUMA discovered the bug in drop_pagecache_sb() and Jan Kara
reminds fixing the other two cases.

Masayoshi MIZUMA has a nice panic flow:

=====================================================================
[process A] | [process B]
| |
| prune_icache() | drop_pagecache()
| spin_lock(&inode_lock) | drop_pagecache_sb()
| inode->i_state |= I_FREEING; | |
| spin_unlock(&inode_lock) | V
| | | spin_lock(&inode_lock)
| V | |
| dispose_list() | |
| list_del() | |
| clear_inode() | |
| inode->i_state = I_CLEAR | |
| | | V
| | | if (inode->i_state & (I_FREEING|I_WILL_FREE))
| | | continue; <==== NOT MATCH
| | |
| | | (DANGER from here on! Accessing disposing inode!)
| | |
| | | __iget()
| | | list_move() <===== PANIC on poisoned list !!
V V |
(time)
=====================================================================

Reported-by: Masayoshi MIZUMA <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Wu Fengguang <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[chrisw: backport to 2.6.29]
Signed-off-by: Chris Wright <[email protected]>
---
fs/dquot.c | 2 +-
fs/drop_caches.c | 2 +-
fs/fs-writeback.c | 3 ++-
3 files changed, 4 insertions(+), 3 deletions(-)

--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -18,7 +18,7 @@ static void drop_pagecache_sb(struct sup

spin_lock(&inode_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
- if (inode->i_state & (I_FREEING|I_WILL_FREE))
+ if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
continue;
if (inode->i_mapping->nrpages == 0)
continue;
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -538,7 +538,8 @@ void generic_sync_sb_inodes(struct super
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
struct address_space *mapping;

- if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW))
+ if (inode->i_state &
+ (I_FREEING|I_CLEAR|I_WILL_FREE|I_NEW))
continue;
mapping = inode->i_mapping;
if (mapping->nrpages == 0)
--- a/fs/dquot.c
+++ b/fs/dquot.c
@@ -793,7 +793,7 @@ static void add_dquot_ref(struct super_b
continue;
if (!dqinit_needed(inode, type))
continue;
- if (inode->i_state & (I_FREEING|I_WILL_FREE))
+ if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
continue;

__iget(inode);