2009-03-12 12:30:26

by Mikulas Patocka

[permalink] [raw]
Subject: [PATCH] deadlock when swapping to FAT

Hi

swapon will deadlock when an attempt to swap to a file on FAT filesystem
is made. swapon holds i_mutex and FAT bmap is attempting to take it again.

This bug was introduced somewhere between 2.6.27 and 2.6.28.

No other filesystem is taking i_mutex in bmap, so I removed it as well.

Note that there are many other cases where bmap can race with truncate in
almost all the filesystems. They don't impose an immediate threat because
bmap can be only called by root. These problems should be solved in a
generic way, not in individual filesystems.

Mikulas

Signed-off-by: Mikulas Patocka <[email protected]
Cc: <[email protected]>

---
fs/fat/inode.c | 2 --
1 file changed, 2 deletions(-)

Index: linux-2.6.29-rc7-devel/fs/fat/inode.c
===================================================================
--- linux-2.6.29-rc7-devel.orig/fs/fat/inode.c 2009-03-10 21:17:22.000000000 +0100
+++ linux-2.6.29-rc7-devel/fs/fat/inode.c 2009-03-11 22:52:38.000000000 +0100
@@ -202,9 +202,7 @@ static sector_t _fat_bmap(struct address
sector_t blocknr;

/* fat_get_cluster() assumes the requested blocknr isn't truncated. */
- mutex_lock(&mapping->host->i_mutex);
blocknr = generic_block_bmap(mapping, block, fat_get_block);
- mutex_unlock(&mapping->host->i_mutex);

return blocknr;
}


2009-03-12 16:05:32

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT

Mikulas Patocka <[email protected]> writes:

> swapon will deadlock when an attempt to swap to a file on FAT filesystem
> is made. swapon holds i_mutex and FAT bmap is attempting to take it again.
>
> This bug was introduced somewhere between 2.6.27 and 2.6.28.
>
> No other filesystem is taking i_mutex in bmap, so I removed it as well.
>
> Note that there are many other cases where bmap can race with truncate in
> almost all the filesystems. They don't impose an immediate threat because
> bmap can be only called by root. These problems should be solved in a
> generic way, not in individual filesystems.

Thanks. This was fixed recently with a bit different way.
--
OGAWA Hirofumi <[email protected]>

2009-03-15 01:54:51

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT



On Fri, 13 Mar 2009, OGAWA Hirofumi wrote:

> Mikulas Patocka <[email protected]> writes:
>
> > swapon will deadlock when an attempt to swap to a file on FAT filesystem
> > is made. swapon holds i_mutex and FAT bmap is attempting to take it again.
> >
> > This bug was introduced somewhere between 2.6.27 and 2.6.28.
> >
> > No other filesystem is taking i_mutex in bmap, so I removed it as well.
> >
> > Note that there are many other cases where bmap can race with truncate in
> > almost all the filesystems. They don't impose an immediate threat because
> > bmap can be only called by root. These problems should be solved in a
> > generic way, not in individual filesystems.
>
> Thanks. This was fixed recently with a bit different way.
> --
> OGAWA Hirofumi <[email protected]>

Note that the same race condition is happening in all the other
filesystems. Maybe move that i_alloc_sem up to ->bmap method caller?

Mikulas

2009-03-15 03:23:19

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT

Mikulas Patocka <[email protected]> writes:

> Note that the same race condition is happening in all the other
> filesystems. Maybe move that i_alloc_sem up to ->bmap method caller?

It can be. However, I guess locking strategy would be per
filesystems. Because the fs may be using i_alloc_sem in get_block
already.

Thanks.
--
OGAWA Hirofumi <[email protected]>

2009-03-17 12:23:41

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT

On Sun, 15 Mar 2009, OGAWA Hirofumi wrote:

> Mikulas Patocka <[email protected]> writes:
>
> > Note that the same race condition is happening in all the other
> > filesystems. Maybe move that i_alloc_sem up to ->bmap method caller?
>
> It can be. However, I guess locking strategy would be per
> filesystems. Because the fs may be using i_alloc_sem in get_block
> already.

Which ones take it in get_block? I grepped for i_alloc_sem and don't see
them. Besides, it is mostly taken only for read and recursive taking of
read-lock for read is allowed. It is taken for writes only in truncate.

Mikulas

> Thanks.
> --
> OGAWA Hirofumi <[email protected]>
>

2009-03-17 16:09:13

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT

Mikulas Patocka <[email protected]> writes:

> On Sun, 15 Mar 2009, OGAWA Hirofumi wrote:
>
>> Mikulas Patocka <[email protected]> writes:
>>
>> > Note that the same race condition is happening in all the other
>> > filesystems. Maybe move that i_alloc_sem up to ->bmap method caller?
>>
>> It can be. However, I guess locking strategy would be per
>> filesystems. Because the fs may be using i_alloc_sem in get_block
>> already.
>
> Which ones take it in get_block? I grepped for i_alloc_sem and don't see
> them. Besides, it is mostly taken only for read and recursive taking of
> read-lock for read is allowed. It is taken for writes only in truncate.

I don't know which fs take it, and whether i_alloc_sem is enough for
which fs. It was just guess. And important one is locking strategy of
that would be per filesystems. E.g. it seems XFS is taking own lock.

Well, personally, I don't have objection to add i_alloc_sem, however I'm
not sure, what does i_alloc_sem guarantee for other fs.
--
OGAWA Hirofumi <[email protected]>

2009-03-19 19:34:20

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT

On Wed, 18 Mar 2009, OGAWA Hirofumi wrote:

> Mikulas Patocka <[email protected]> writes:
>
> > On Sun, 15 Mar 2009, OGAWA Hirofumi wrote:
> >
> >> Mikulas Patocka <[email protected]> writes:
> >>
> >> > Note that the same race condition is happening in all the other
> >> > filesystems. Maybe move that i_alloc_sem up to ->bmap method caller?
> >>
> >> It can be. However, I guess locking strategy would be per
> >> filesystems. Because the fs may be using i_alloc_sem in get_block
> >> already.
> >
> > Which ones take it in get_block? I grepped for i_alloc_sem and don't see
> > them. Besides, it is mostly taken only for read and recursive taking of
> > read-lock for read is allowed. It is taken for writes only in truncate.
>
> I don't know which fs take it, and whether i_alloc_sem is enough for
> which fs. It was just guess. And important one is locking strategy of
> that would be per filesystems. E.g. it seems XFS is taking own lock.
>
> Well, personally, I don't have objection to add i_alloc_sem, however I'm
> not sure, what does i_alloc_sem guarantee for other fs.

It should prevent truncation under bmap. It is used by direct-io code to
protect the file from being truncated while there's direct-io being
processed on it.

But some filesystems do their own direct-io locking (for example XFS). So
I think it would be best to place the lock to generic_block_bmap, so that
filesystem that doesn't want the lock can easily avoid it.

You can submit this patch after 2.6.29 is released.

Mikulas

> --
> OGAWA Hirofumi <[email protected]>


FAT filesystem used down_read(&mapping->host->i_alloc_sem) to prevent
a race between bmap and truncate. However, such race is present in all the
other filesystems --- it is generally assumed that blocks queried with
get_block won't disappear while get_block is in progress.

The race can be only triggered by root, non-privileged users can't use
bmap, so it is not a security issue (unless there is some program run
by root that bmaps users' files).

This patch fixes the race in a generic way, in all the filesystems. If some
filesystem employs its own locking and doesn't want to take i_alloc_sem
(I don't know about any, where taking i_alloc_sem could be problem),
let it use its own function and not generic_block_bmap.

Signed-off-by: Mikulas Patocka <[email protected]>

---
fs/buffer.c | 8 ++++++++
fs/fat/inode.c | 2 --
2 files changed, 8 insertions(+), 2 deletions(-)

Index: linux-2.6.29-rc8-devel/fs/buffer.c
===================================================================
--- linux-2.6.29-rc8-devel.orig/fs/buffer.c 2009-03-19 15:57:03.000000000 +0100
+++ linux-2.6.29-rc8-devel/fs/buffer.c 2009-03-19 15:58:00.000000000 +0100
@@ -2964,7 +2964,15 @@ sector_t generic_block_bmap(struct addre
tmp.b_state = 0;
tmp.b_blocknr = 0;
tmp.b_size = 1 << inode->i_blkbits;
+
+ /*
+ * Protect the inode from being truncated while get_block is
+ * in progress.
+ */
+ down_read(&mapping->host->i_alloc_sem);
get_block(inode, block, &tmp, 0);
+ up_read(&mapping->host->i_alloc_sem);
+
return tmp.b_blocknr;
}

Index: linux-2.6.29-rc8-devel/fs/fat/inode.c
===================================================================
--- linux-2.6.29-rc8-devel.orig/fs/fat/inode.c 2009-03-19 15:56:50.000000000 +0100
+++ linux-2.6.29-rc8-devel/fs/fat/inode.c 2009-03-19 15:56:58.000000000 +0100
@@ -202,9 +202,7 @@ static sector_t _fat_bmap(struct address
sector_t blocknr;

/* fat_get_cluster() assumes the requested blocknr isn't truncated. */
- down_read(&mapping->host->i_alloc_sem);
blocknr = generic_block_bmap(mapping, block, fat_get_block);
- up_read(&mapping->host->i_alloc_sem);

return blocknr;
}

2009-03-19 22:27:29

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [PATCH] deadlock when swapping to FAT

Mikulas Patocka <[email protected]> writes:

> It should prevent truncation under bmap. It is used by direct-io code to
> protect the file from being truncated while there's direct-io being
> processed on it.

I know, FAT is ok, of course, and the simple fs would also be ok.
However, it's not enough for all filesystems, in theory.

> But some filesystems do their own direct-io locking (for example XFS). So
> I think it would be best to place the lock to generic_block_bmap, so that
> filesystem that doesn't want the lock can easily avoid it.
>
> You can submit this patch after 2.6.29 is released.

I don't have objection to it. Please submit it yourself.
--
OGAWA Hirofumi <[email protected]>