2009-09-29 12:22:14

by Suresh Jayaraman

[permalink] [raw]
Subject: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

While testing Swap over NFS patchset, I noticed an oops that was triggered
during swapon. Investigating further, the NULL pointer deference is due to the
SSD device check/optimization in the swapon code that assumes s_bdev could never
be NULL.

inode->i_sb->s_bdev could be NULL in a few cases. For e.g. one such case is
loopback NFS mount, there could be others as well. Fix this by ensuring s_bdev
is not NULL before we try to deference s_bdev.

Signed-off-by: Suresh Jayaraman <[email protected]>
---
mm/swapfile.c | 12 +++++++-----
1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4de7f02..a1bc6b9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1974,12 +1974,14 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
goto bad_swap;
}

- if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
- p->flags |= SWP_SOLIDSTATE;
- p->cluster_next = 1 + (random32() % p->highest_bit);
+ if (p->bdev) {
+ if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
+ p->flags |= SWP_SOLIDSTATE;
+ p->cluster_next = 1 + (random32() % p->highest_bit);
+ }
+ if (discard_swap(p) == 0)
+ p->flags |= SWP_DISCARDABLE;
}
- if (discard_swap(p) == 0)
- p->flags |= SWP_DISCARDABLE;

mutex_lock(&swapon_mutex);
spin_lock(&swap_lock);


2009-09-30 06:57:20

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

On Tue, Sep 29 2009, Suresh Jayaraman wrote:
> While testing Swap over NFS patchset, I noticed an oops that was triggered
> during swapon. Investigating further, the NULL pointer deference is due to the
> SSD device check/optimization in the swapon code that assumes s_bdev could never
> be NULL.
>
> inode->i_sb->s_bdev could be NULL in a few cases. For e.g. one such case is
> loopback NFS mount, there could be others as well. Fix this by ensuring s_bdev
> is not NULL before we try to deference s_bdev.
>
> Signed-off-by: Suresh Jayaraman <[email protected]>
> ---
> mm/swapfile.c | 12 +++++++-----
> 1 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 4de7f02..a1bc6b9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1974,12 +1974,14 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> goto bad_swap;
> }
>
> - if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
> - p->flags |= SWP_SOLIDSTATE;
> - p->cluster_next = 1 + (random32() % p->highest_bit);
> + if (p->bdev) {
> + if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
> + p->flags |= SWP_SOLIDSTATE;
> + p->cluster_next = 1 + (random32() % p->highest_bit);
> + }
> + if (discard_swap(p) == 0)
> + p->flags |= SWP_DISCARDABLE;
> }
> - if (discard_swap(p) == 0)
> - p->flags |= SWP_DISCARDABLE;
>
> mutex_lock(&swapon_mutex);
> spin_lock(&swap_lock);

Thanks for the patch, looks correct.

--
Jens Axboe

2009-10-01 10:08:11

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

On Tue, 29 Sep 2009, Suresh Jayaraman wrote:

> While testing Swap over NFS patchset, I noticed an oops that was triggered
> during swapon. Investigating further, the NULL pointer deference is due to the
> SSD device check/optimization in the swapon code that assumes s_bdev could never
> be NULL.
>
> inode->i_sb->s_bdev could be NULL in a few cases. For e.g. one such case is
> loopback NFS mount, there could be others as well. Fix this by ensuring s_bdev
> is not NULL before we try to deference s_bdev.
>
> Signed-off-by: Suresh Jayaraman <[email protected]>

Acked-by: Hugh Dickins <[email protected]>

Thanks a lot for that: sorry to say I was ignorant of the possibility.

This is only an issue with an out-of-tree patch, is that correct?
I'd like it to be fixed anyway, but if there's a way in which it can
happen in unpatched 2.6.31, then we ought to send the fix to -stable.

I've added Rafael to the Cc, because CONFIG_HIBERNATION's swap_type_of()
looks also dangerous in this respect - and especially where it does that
"if (bdev == sis->bdev) {" match, I think it's assuming NULL bdev cannot
match against anything.

Hugh

> ---
> mm/swapfile.c | 12 +++++++-----
> 1 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 4de7f02..a1bc6b9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1974,12 +1974,14 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> goto bad_swap;
> }
>
> - if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
> - p->flags |= SWP_SOLIDSTATE;
> - p->cluster_next = 1 + (random32() % p->highest_bit);
> + if (p->bdev) {
> + if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
> + p->flags |= SWP_SOLIDSTATE;
> + p->cluster_next = 1 + (random32() % p->highest_bit);
> + }
> + if (discard_swap(p) == 0)
> + p->flags |= SWP_DISCARDABLE;
> }
> - if (discard_swap(p) == 0)
> - p->flags |= SWP_DISCARDABLE;
>
> mutex_lock(&swapon_mutex);
> spin_lock(&swap_lock);

2009-10-01 11:29:31

by Suresh Jayaraman

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

Hugh Dickins wrote:
> On Tue, 29 Sep 2009, Suresh Jayaraman wrote:
>
>> While testing Swap over NFS patchset, I noticed an oops that was triggered
>> during swapon. Investigating further, the NULL pointer deference is due to the
>> SSD device check/optimization in the swapon code that assumes s_bdev could never
>> be NULL.
>>
>> inode->i_sb->s_bdev could be NULL in a few cases. For e.g. one such case is
>> loopback NFS mount, there could be others as well. Fix this by ensuring s_bdev
>> is not NULL before we try to deference s_bdev.
>>
>> Signed-off-by: Suresh Jayaraman <[email protected]>
>
> Acked-by: Hugh Dickins <[email protected]>
>
> Thanks a lot for that: sorry to say I was ignorant of the possibility.
>
> This is only an issue with an out-of-tree patch, is that correct?

Yes, it was reproducible only with an out-of-tree patch.

> I'd like it to be fixed anyway, but if there's a way in which it can
> happen in unpatched 2.6.31, then we ought to send the fix to -stable.
>
> I've added Rafael to the Cc, because CONFIG_HIBERNATION's swap_type_of()
> looks also dangerous in this respect - and especially where it does that
> "if (bdev == sis->bdev) {" match, I think it's assuming NULL bdev cannot
> match against anything.

Yeah, perhaps. I stumbled upon one more of such error - a NULL pointer
dereference in blkdev_issue_discard() called from get_swap_page() when I ran
memhog, a simple program to generate a memory hog with Swap over NFS patches.

The call sequence is add_to_swap() -> get_swap_page() -> scan_swap_map()
-> discard_swap_cluster() -> blkdev_issue_discard().

Wrapping the code around a NULL check fixes the Oops for me.


Signed-off-by: Suresh Jayaraman <[email protected]>
---
mm/swapfile.c | 18 +++++++++++-------
1 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4de7f02..51f39cf 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -160,11 +160,13 @@ static int discard_swap(struct swap_info_struct *si)
continue;
}

- err = blkdev_issue_discard(si->bdev, start_block,
- nr_blocks, GFP_KERNEL,
- DISCARD_FL_BARRIER);
- if (err)
- break;
+ if (si->bdev) {
+ err = blkdev_issue_discard(si->bdev, start_block,
+ nr_blocks, GFP_KERNEL,
+ DISCARD_FL_BARRIER);
+ if (err)
+ break;
+ }

cond_resched();
}
@@ -200,10 +202,12 @@ static void discard_swap_cluster(struct swap_info_struct *si,

start_block <<= PAGE_SHIFT - 9;
nr_blocks <<= PAGE_SHIFT - 9;
- if (blkdev_issue_discard(si->bdev, start_block,
+ if (si->bdev) {
+ if (blkdev_issue_discard(si->bdev, start_block,
nr_blocks, GFP_NOIO,
DISCARD_FL_BARRIER))
- break;
+ break;
+ }
}

lh = se->list.next;

2009-10-01 11:53:21

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>
> Yeah, perhaps. I stumbled upon one more of such error - a NULL pointer
> dereference in blkdev_issue_discard() called from get_swap_page() when I ran
> memhog, a simple program to generate a memory hog with Swap over NFS patches.
>
> The call sequence is add_to_swap() -> get_swap_page() -> scan_swap_map()
> -> discard_swap_cluster() -> blkdev_issue_discard().
>
> Wrapping the code around a NULL check fixes the Oops for me.

That's odd: scan_swap_map() should only discard_swap_cluster() if
SWP_DISCARDABLE got set, and your first patch made sure that it wasn't.

So I don't think this second patch should be necessary: you did have
your first applied when you found this?

I wonder if there's a funny little issue like si->lowest_alloc not
being reset to 0 where it should be. Were you switching between
NFS swap and SSD swap in your testing?

Hugh

2009-10-01 12:32:44

by Suresh Jayaraman

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

Hugh Dickins wrote:
> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>> Yeah, perhaps. I stumbled upon one more of such error - a NULL pointer
>> dereference in blkdev_issue_discard() called from get_swap_page() when I ran
>> memhog, a simple program to generate a memory hog with Swap over NFS patches.
>>
>> The call sequence is add_to_swap() -> get_swap_page() -> scan_swap_map()
>> -> discard_swap_cluster() -> blkdev_issue_discard().
>>
>> Wrapping the code around a NULL check fixes the Oops for me.
>
> That's odd: scan_swap_map() should only discard_swap_cluster() if
> SWP_DISCARDABLE got set, and your first patch made sure that it wasn't.

I forgot to mention, this is not on loopback NFS mount but an remote NFS
mount (so possibly s_bdev is not NULL) when doing swapon. The oops was
triggered when memhog program tries to use the swap space on the newly
created swapfile on NFS. I have not completely investigated the issue,
perhaps s_bdev is not being set when it ought to be..

> So I don't think this second patch should be necessary: you did have
> your first applied when you found this?

yes, I had the first patch applied when it oopsed and I don't use SSD at
all.

> I wonder if there's a funny little issue like si->lowest_alloc not
> being reset to 0 where it should be. Were you switching between
> NFS swap and SSD swap in your testing?

No.

Thanks,

--
Suresh Jayaraman

2009-10-06 21:04:34

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
> Hugh Dickins wrote:
> > On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
> >>
> >> The call sequence is add_to_swap() -> get_swap_page() -> scan_swap_map()
> >> -> discard_swap_cluster() -> blkdev_issue_discard().
> >>
> >> Wrapping the code around a NULL check fixes the Oops for me.
> >
> > That's odd: scan_swap_map() should only discard_swap_cluster() if
> > SWP_DISCARDABLE got set, and your first patch made sure that it wasn't.
>
> I forgot to mention, this is not on loopback NFS mount but an remote NFS
> mount (so possibly s_bdev is not NULL) when doing swapon. The oops was
> triggered when memhog program tries to use the swap space on the newly
> created swapfile on NFS. I have not completely investigated the issue,
> perhaps s_bdev is not being set when it ought to be..

I'm happy to see your first patch already in 2.6.32-rc3, but still
suspicious of this second patch you sent afterwards. A quick skim
through your patchset suggests 23/31 is probably responsible:

--- mmotm.orig/include/linux/swap.h
+++ mmotm/include/linux/swap.h
@@ -120,6 +120,7 @@ struct swap_extent {
enum {
SWP_USED = (1 << 0), /* is slot in swap_info[] used? */
SWP_WRITEOK = (1 << 1), /* ok to write to this swap? */
+ SWP_FILE = (1 << 2), /* file swap area */
SWP_DISCARDABLE = (1 << 2), /* blkdev supports discard */
SWP_DISCARDING = (1 << 3), /* now discarding a free cluster */
SWP_SOLIDSTATE = (1 << 4), /* blkdev seeks are cheap */

Hugh

2009-10-07 03:54:28

by Suresh Jayaraman

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

Hugh Dickins wrote:
> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>> Hugh Dickins wrote:
>>> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>>>> The call sequence is add_to_swap() -> get_swap_page() -> scan_swap_map()
>>>> -> discard_swap_cluster() -> blkdev_issue_discard().
>>>>
>>>> Wrapping the code around a NULL check fixes the Oops for me.
>>> That's odd: scan_swap_map() should only discard_swap_cluster() if
>>> SWP_DISCARDABLE got set, and your first patch made sure that it wasn't.
>> I forgot to mention, this is not on loopback NFS mount but an remote NFS
>> mount (so possibly s_bdev is not NULL) when doing swapon. The oops was
>> triggered when memhog program tries to use the swap space on the newly
>> created swapfile on NFS. I have not completely investigated the issue,
>> perhaps s_bdev is not being set when it ought to be..
>
> I'm happy to see your first patch already in 2.6.32-rc3, but still
> suspicious of this second patch you sent afterwards. A quick skim
> through your patchset suggests 23/31 is probably responsible:

Thanks for skimming through those patches. I noticed this already during
my review and thought I had already fixed this, but apparently I missed
out. I'll test after fixing this and report.

> --- mmotm.orig/include/linux/swap.h
> +++ mmotm/include/linux/swap.h
> @@ -120,6 +120,7 @@ struct swap_extent {
> enum {
> SWP_USED = (1 << 0), /* is slot in swap_info[] used? */
> SWP_WRITEOK = (1 << 1), /* ok to write to this swap? */
> + SWP_FILE = (1 << 2), /* file swap area */
> SWP_DISCARDABLE = (1 << 2), /* blkdev supports discard */
> SWP_DISCARDING = (1 << 3), /* now discarding a free cluster */
> SWP_SOLIDSTATE = (1 << 4), /* blkdev seeks are cheap */
>
> Hugh

Thanks,

--
Suresh Jayaraman

2009-10-21 12:42:03

by Suresh Jayaraman

[permalink] [raw]
Subject: Re: [PATCH] swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL

Suresh Jayaraman wrote:
> Hugh Dickins wrote:
>> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>>> Hugh Dickins wrote:
>>>> On Thu, 1 Oct 2009, Suresh Jayaraman wrote:
>>>>> The call sequence is add_to_swap() -> get_swap_page() -> scan_swap_map()
>>>>> -> discard_swap_cluster() -> blkdev_issue_discard().
>>>>>
>>>>> Wrapping the code around a NULL check fixes the Oops for me.
>>>> That's odd: scan_swap_map() should only discard_swap_cluster() if
>>>> SWP_DISCARDABLE got set, and your first patch made sure that it wasn't.
>>> I forgot to mention, this is not on loopback NFS mount but an remote NFS
>>> mount (so possibly s_bdev is not NULL) when doing swapon. The oops was
>>> triggered when memhog program tries to use the swap space on the newly
>>> created swapfile on NFS. I have not completely investigated the issue,
>>> perhaps s_bdev is not being set when it ought to be..
>> I'm happy to see your first patch already in 2.6.32-rc3, but still
>> suspicious of this second patch you sent afterwards. A quick skim
>> through your patchset suggests 23/31 is probably responsible:
>
> Thanks for skimming through those patches. I noticed this already during
> my review and thought I had already fixed this, but apparently I missed
> out. I'll test after fixing this and report.

Sorry about the delay. I tested by fixing SWP_FILE macro only now after
returning from vacation. I was not able to reproduce this. Please ignore
the second patch and it's no longer necessary as noted by Hugh already.

Thanks,


>> --- mmotm.orig/include/linux/swap.h
>> +++ mmotm/include/linux/swap.h
>> @@ -120,6 +120,7 @@ struct swap_extent {
>> enum {
>> SWP_USED = (1 << 0), /* is slot in swap_info[] used? */
>> SWP_WRITEOK = (1 << 1), /* ok to write to this swap? */
>> + SWP_FILE = (1 << 2), /* file swap area */
>> SWP_DISCARDABLE = (1 << 2), /* blkdev supports discard */
>> SWP_DISCARDING = (1 << 3), /* now discarding a free cluster */
>> SWP_SOLIDSTATE = (1 << 4), /* blkdev seeks are cheap */
>>
>> Hugh
>
> Thanks,
>


--
Suresh Jayaraman