2003-02-25 02:14:51

by Jonah Sherman

[permalink] [raw]
Subject: [OOPS] 2.5.63 - NULL pointer dereference in loop device

Unable to handle kernel NULL pointer dereference at virtual address 00000028
printing eip:
c02298bb
*pde = 00000000
Oops: 0002
CPU: 0
EIP: 0060:[<c02298bb>] Not tainted
EFLAGS: 00010282
EIP is at loop_get_buffer+0x5b/0x70
eax: 00000000 ebx: c0511160 ecx: c0511120 edx: c7fec910
esi: c114a000 edi: c114a110 ebp: c5af7a2c esp: c5af7a18
ds: 007b es: 007b ss: 0068
Process dd (pid: 221, threadinfo=c5af6000 task=c748a6c0)
Stack: c0511160 00000010 00000001 c114a000 00000001 c5af7a50 c0229b06 c114a000
c0511160 c7feee20 00000000 c7ed13c0 00000001 c114a110 c5af7a84 c021d83d
c114a110 c0511160 00000010 00000001 c5af7a6c 00000000 c61fc460 0000513b
Call Trace:
[<c0229b06>] loop_make_request+0xd6/0x170
[<c021d83d>] generic_make_request+0x13d/0x1d0
[<c021d90d>] submit_bio+0x3d/0x80
[<c01529c7>] __block_write_full_page+0x207/0x3e0
[<c0154125>] block_write_full_page+0xd5/0xe0
[<c0156d20>] blkdev_get_block+0x0/0x60
[<c0156e80>] blkdev_writepage+0x20/0x30
[<c0156d20>] blkdev_get_block+0x0/0x60
[<c013e86c>] shrink_list+0x3cc/0x610
[<c0124815>] add_timer+0x95/0xb0
[<c013d679>] __pagevec_release+0x29/0x40
[<c013ec0a>] shrink_cache+0x15a/0x2d0
[<c013f39e>] shrink_caches+0x9e/0xc0
[<c013f435>] try_to_free_pages+0x75/0xe0
[<c0138f28>] __alloc_pages+0x1b8/0x2c0
[<c0137b03>] __grab_cache_page+0x43/0xa1
[<c0137018>] generic_file_aio_write_nolock+0x228/0x7b0
[<c021bd58>] blk_rq_map_sg+0xf8/0x170
[<c02464a9>] __ide_dma_begin+0x39/0x50
[<c0137618>] generic_file_write_nolock+0x78/0x90
[<c012d1d8>] rcu_process_callbacks+0xe8/0x110
[<c01212e6>] tasklet_action+0x46/0x70
[<c010bce8>] do_IRQ+0x108/0x130
[<c010a6e0>] common_interrupt+0x18/0x20
[<c0157ea3>] blkdev_file_write+0x33/0x40
[<c014fb3f>] vfs_write+0xaf/0x120
[<c014fc4c>] sys_write+0x3c/0x60
[<c010a573>] syscall_call+0x7/0xb

Code: c7 40 28 d0 97 22 c0 89 c2 89 58 30 eb b1 8d b4 26 00 00 00


Attachments:
(No filename) (0.00 B)
(No filename) (189.00 B)
Download all attachments

2003-02-25 05:35:15

by Andrew Morton

[permalink] [raw]
Subject: Re: [OOPS] 2.5.63 - NULL pointer dereference in loop device

Jonah Sherman <[email protected]> wrote:
>
> I have come across a bug in the loop driver.

So you have. See, the `dd' fills the machine up with dirty memory and the
loop driver wants to take a copy of all that memory to write it out. It takes
this copy inside page reclaim, where we are allowed to use *all* memory.

So obviously, as there is so much memory to be written, we just run out of
the stuff.

The loop driver has a tendency to do this sort of thing. Whenever it does,
we slap another bandaid on it.


diff -puN drivers/block/loop.c~loop-hack drivers/block/loop.c
--- 25/drivers/block/loop.c~loop-hack 2003-02-24 21:21:11.000000000 -0800
+++ 25-akpm/drivers/block/loop.c 2003-02-24 21:45:13.000000000 -0800
@@ -447,7 +447,22 @@ static struct bio *loop_get_buffer(struc
goto out_bh;
}

- bio = bio_copy(rbh, GFP_NOIO, rbh->bi_rw & WRITE);
+ /*
+ * When called on the page reclaim -> writepage path, this code can
+ * trivially consume all memory. So we drop PF_MEMALLOC to avoid
+ * stealing all the page reserves and throttle to the writeout rate.
+ * pdflush will have been woken by page reclaim. Let it do its work.
+ */
+ do {
+ int flags = current->flags;
+
+ current->flags &= ~PF_MEMALLOC;
+ bio = bio_copy(rbh, (GFP_ATOMIC & ~__GFP_HIGH) | __GFP_NOWARN,
+ rbh->bi_rw & WRITE);
+ current->flags = flags;
+ if (bio == NULL)
+ blk_congestion_wait(WRITE, HZ/10);
+ } while (bio == NULL);

bio->bi_end_io = loop_end_io_transfer;
bio->bi_private = rbh;

_

2003-02-25 21:03:58

by Hugh Dickins

[permalink] [raw]
Subject: Re: [OOPS] 2.5.63 - NULL pointer dereference in loop device

On Mon, 24 Feb 2003, Jonah Sherman wrote:

> I have come across a bug in the loop driver. To reproduce this bug,
> simply do:
> losetup /dev/loop0 /dev/[sh]da(some unused partition)
> dd if=/dev/zero of=/dev/loop0
> If you use the count/bs args for dd, or just wait for it to write alot
> with the above command, you will notice that once it writes more
> than the free amount of RAM, you will get the following oops.
> The strange thing is, this ONLY occurs when attaching the loop device
> to another block device(well, I only tested with an unused partition);
> this bug does NOT occur if the loop target is a regular file on the
> filesystem. I'm a very newbie kernel hacker, so I am not able to create
> a patch to fix this. However, I did trace the oops to
> drivers/block/loop.c:loop_get_buffer(): (kernel 2.5.63, however same
> bug is in 2.5.62, I don't know when it was first introduced)
>
> bio=bio_copy(rbh,GFP_NOIO,rbh->bi_rw & WRITE); // This returns NULL
> bio->bi_end_io = loop_end_io_transfer; // Making this crash

I can't reproduce this, and I don't understand it: please help me!

If you "losetup /dev/loop0 /dev/hdN", then it's LO_FLAGS_BH_REMAP
and doesn't even call bio_copy: it doesn't copy bio or buffers or
pages (unless you have highmem, which you don't mention: then its
pointless wasteful blk_queue_bounce might cause trouble), it's a
straight route through to disk, which should be using mempools
to complete i/o even if the rest of the system is out of memory.

Of course the loop driver is wrong to ignore NULL return from bio_copy
(if you used losetup -e), and there's a lot of unnecessary allocation
and copying and a lot of opportunity for deadlock, for which I have
some perpetually unfinished patches.

But the loop to disk is relatively straightforward, pdflush should
take care of the dirty pages Andrew worries about (though in writing
to blockdev when there's highmem, pdflush may kick in too late); and
I couldn't even reproduce your oops using "-e xor".

Can you shed more light on how to reproduce this?
Thanks,
Hugh

2003-02-26 00:27:28

by Jonah Sherman

[permalink] [raw]
Subject: Re: [OOPS] 2.5.63 - NULL pointer dereference in loop device

On Tue, Feb 25, 2003 at 09:15:56PM +0000, Hugh Dickins wrote:
> If you "losetup /dev/loop0 /dev/hdN", then it's LO_FLAGS_BH_REMAP
> and doesn't even call bio_copy: it doesn't copy bio or buffers or

It appears this way if you just look at none_status, but you didn't look
at loop_init_xfer(). Notice that it doesn't call xfer->init unless
type != 0, so that flag is infact never set.

> pages (unless you have highmem, which you don't mention: then its
> pointless wasteful blk_queue_bounce might cause trouble), it's a
> straight route through to disk, which should be using mempools
> to complete i/o even if the rest of the system is out of memory.

I'm not using highmem.

> Of course the loop driver is wrong to ignore NULL return from bio_copy
> (if you used losetup -e), and there's a lot of unnecessary allocation
> and copying and a lot of opportunity for deadlock, for which I have
> some perpetually unfinished patches.
>
> But the loop to disk is relatively straightforward, pdflush should
> take care of the dirty pages Andrew worries about (though in writing
> to blockdev when there's highmem, pdflush may kick in too late); and
> I couldn't even reproduce your oops using "-e xor".
>
> Can you shed more light on how to reproduce this?

The block dev it is being used on must be larger than your RAM. I don't
have any swap on this machine, so I don't know if it must be bigger than
that too. Maybe disabling swap before testing this oops will make it
work?

In any case, the patch sent by Andrew Morton fixed this bug.


Attachments:
(No filename) (1.50 kB)
(No filename) (189.00 B)
Download all attachments

2003-02-26 10:41:32

by Andrew Morton

[permalink] [raw]
Subject: Re: [OOPS] 2.5.63 - NULL pointer dereference in loop device

Hugh Dickins <[email protected]> wrote:
>
> On Mon, 24 Feb 2003, Jonah Sherman wrote:
>
> > I have come across a bug in the loop driver. To reproduce this bug,
> > simply do:
> ...
> I can't reproduce this, and I don't understand it: please help me!

Well I can't make it happen either now. It went pop first time I tried it
yesterday.

That being said, I can still trigger it by mmapping /dev/loop0 MAP_SHARED and
dirtying it all. That triggers the problematic PF_MEMALLOC path much more easily.

mem=256M
losetup /dev/loop0 /dev/hda5
usemem -m 300 -f /dev/loop0
<oops>

(gdb) p/x lo->lo_flags
$3 = 0x0

Userspace is passing in lo_encrypt_type == 0, so loop_init_xfer() never calls the transfer
init function.

2003-02-26 13:20:45

by Hugh Dickins

[permalink] [raw]
Subject: Re: [OOPS] 2.5.63 - NULL pointer dereference in loop device

On Tue, 25 Feb 2003, Jonah Sherman wrote:
> On Tue, Feb 25, 2003 at 09:15:56PM +0000, Hugh Dickins wrote:
> > If you "losetup /dev/loop0 /dev/hdN", then it's LO_FLAGS_BH_REMAP
> > and doesn't even call bio_copy: it doesn't copy bio or buffers or
>
> It appears this way if you just look at none_status, but you didn't look
> at loop_init_xfer(). Notice that it doesn't call xfer->init unless
> type != 0, so that flag is infact never set.

Hah! Indeed, thank you for setting me straight.

> > Can you shed more light on how to reproduce this?
>
> The block dev it is being used on must be larger than your RAM. I don't
> have any swap on this machine, so I don't know if it must be bigger than
> that too. Maybe disabling swap before testing this oops will make it
> work?

Yes, I'd tried block dev larger than RAM, with and without swap:
no luck. Perhaps my disks somehow keep up with the stuff coming
down from /dev/zero but yours don't? (No disgrace!) But never mind,
Andrew has a surer way to hit it.

> In any case, the patch sent by Andrew Morton fixed this bug.

Andrew's patch for this is a thing of great beauty, a shining beacon
to inspire all whose eyes are graced by it. But I'd still have liked
to understand why it's needed in your case not mine: it's a bit
embarrassing for me to claim I have patches for various loopmem
hangs, yet be unable to reproduce this.

Hugh

2003-02-26 13:43:54

by Hugh Dickins

[permalink] [raw]
Subject: Re: [OOPS] 2.5.63 - NULL pointer dereference in loop device

On Wed, 26 Feb 2003, Andrew Morton wrote:
> Hugh Dickins <[email protected]> wrote:
> > On Mon, 24 Feb 2003, Jonah Sherman wrote:
> >
> > > I have come across a bug in the loop driver. To reproduce this bug,
> > > simply do:
> > ...
> > I can't reproduce this, and I don't understand it: please help me!
>
> Well I can't make it happen either now. It went pop first time I tried it
> yesterday.

Thanks for trying again. Strange.

> That being said, I can still trigger it by mmapping /dev/loop0 MAP_SHARED and
> dirtying it all. That triggers the problematic PF_MEMALLOC path much more easily.
>
> mem=256M
> losetup /dev/loop0 /dev/hda5
> usemem -m 300 -f /dev/loop0
> <oops>

Yes, that's better, I can easily reproduce and understand that one, thanks.

> (gdb) p/x lo->lo_flags
> $3 = 0x0
>
> Userspace is passing in lo_encrypt_type == 0, so loop_init_xfer() never calls the transfer
> init function.

Yes, Jonah enlightened me on that little detail.

I notice that Jari's loop.c, incorporating some or all of Adam's mods,
http://loop-aes.sourceforge.net/updates/2003-02-19/loop-AES/loop.c-2.5.patched.bz2
suffers from neither of these problems: it preallocates, and survives
the usemem; and has no LO_FLAGS_BH_REMAP to deceive me.

Should we reconsider switching to that version of loop.c? You backed
it (or a relative) out of -mm a couple of months back, but given the
changes currently going on, an updated loop.c doesn't seem so wild.

I'd still expect (perhaps unjustly, testing hasn't got that far) it
to suffer from some "cp /dev/zero" problems, but no more than the
present one, since clearly Jari/Adam have given some thought there.

It does look like it's the _maintained_ version of loop.c, so a bit
disheartening to be fixing up problems on the version in the main tree.
(It would also make the recent NFS change for loop unnecessary?)

Hugh