2012-10-02 17:58:32

by Shuah Khan

[permalink] [raw]
Subject: kernel NULL pointer dereference at rb_erase+0x1a3/0x370

I started seeing the following null pointer dereference on
a linux-next sept 21 git and still seeing it on linux-next
Sep 27th git.

Can be reproduced easily. I have been able to reproduce every
time I do a complete build of a kernel on fresh checkout or
touch a header file that forces full build.

Related to lib/rbtree.c commits that went into September 21 perhaps.
I didn't get a chance to investigate this yet, thought I would share
just in case others have seen it.


[ 346.676805] audit_printk_skb: 24 callbacks suppressed
[ 346.676814] type=1400 audit(1349010919.383:28): apparmor="DENIED" operation="capable" parent=1 profile="/usr/sbin/cupsd" pid=905 comm="cupsd" pid=905 comm="cupsd" capability=36 capname="block_suspend"
[ 2219.660124] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 2219.660182] IP: [<ffffffff81323c93>] rb_erase+0x1a3/0x370
[ 2219.660209] PGD 73f2f067 PUD 67246067 PMD 0
[ 2219.660235] Oops: 0000 [#1] SMP
[ 2219.660257] Modules linked in: bnep arc4 iwldvm rfcomm bluetooth mac80211 coretemp radeon kvm_intel kvm snd_hda_codec_analog snd_hda_intel snd_hda_codec iwlwifi snd_hwdep snd_pcm snd_seq_midi snd_rawmidi ttm drm_kms_helper snd_seq_midi_event snd_seq drm pata_pcmcia cfg80211 psmouse snd_timer binfmt_misc pcmcia tpm_infineon hp_wmi snd_seq_device snd sparse_keymap joydev ppdev yenta_socket microcode hp_accel lis3lv02d soundcore wmi i2c_algo_bit parport_pc serio_raw pcmcia_rsrc pcmcia_core tpm_tis input_polldev snd_page_alloc lpc_ich mac_hid video lp parport firewire_ohci firewire_core crc_itu_t sdhci_pci sdhci e1000e
[ 2219.660632] CPU 0
[ 2219.660644] Pid: 1660, comm: recordmcount Not tainted 3.6.0-rc6-next-20120921+ #4 Hewlett-Packard HP EliteBook 6930p/30DC
[ 2219.660684] RIP: 0010:[<ffffffff81323c93>] [<ffffffff81323c93>] rb_erase+0x1a3/0x370
[ 2219.660714] RSP: 0018:ffff880073f318e8 EFLAGS: 00010246
[ 2219.662042] RAX: ffff88007911ea80 RBX: 000000000000003b RCX: 0000000000000000
[ 2219.663345] RDX: 0000000000000000 RSI: ffff88007a4e1c08 RDI: 0000000000000001
[ 2219.664007] RBP: ffff880073f318e8 R08: ffff88007911ea40 R09: 0000000000000000
[ 2219.664007] R10: ffff88007911ea00 R11: 0000000000000018 R12: ffff88007a4e19a0
[ 2219.664007] R13: 0000000000000001 R14: 0000000000000001 R15: 000000000000003c
[ 2219.664007] FS: 00002b380a82bb40(0000) GS:ffff88007fa00000(0000) knlGS:0000000000000000
[ 2219.664007] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2219.664007] CR2: 0000000000000000 CR3: 0000000079428000 CR4: 00000000000407f0
[ 2219.664007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2219.664007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2219.664007] Process recordmcount (pid: 1660, threadinfo ffff880073f30000, task ffff88002fe9ada0)
[ 2219.664007] Stack:
[ 2219.664007] ffff880073f31948 ffffffff8125af6e ffff88007911eaa0 ffff88007911ea80
[ 2219.664007] ffff88002fe7c400 ffff88002fe7c400 0000000000000002 ffff88007a4e19a0
[ 2219.664007] ffff88007aa4e6e8 ffff88002fe7c400 0000000000000001 000000000000003b
[ 2219.664007] Call Trace:
[ 2219.664007] [<ffffffff8125af6e>] ext4_es_insert_extent+0x28e/0x2f0
[ 2219.664007] [<ffffffff81219e1d>] ext4_da_get_block_prep+0x11d/0x3b0
[ 2219.664007] [<ffffffff811afa63>] ? alloc_buffer_head+0x43/0x50
[ 2219.664007] [<ffffffff811afbde>] ? alloc_page_buffers+0x7e/0xf0
[ 2219.664007] [<ffffffff811b218e>] __block_write_begin+0x1ce/0x520
[ 2219.664007] [<ffffffff81219d00>] ? do_journal_get_write_access+0xb0/0xb0
[ 2219.664007] [<ffffffff81126b8f>] ? grab_cache_page_write_begin+0x8f/0xf0
[ 2219.664007] [<ffffffff8121d778>] ext4_da_write_begin+0xc8/0x210
[ 2219.664007] [<ffffffff81126082>] generic_file_buffered_write+0x112/0x290
[ 2219.664007] [<ffffffff81127826>] __generic_file_aio_write+0x1b6/0x3b0
[ 2219.664007] [<ffffffff81127a9f>] generic_file_aio_write+0x7f/0x100
[ 2219.664007] [<ffffffff81216640>] ext4_file_write+0xa0/0x460
[ 2219.664007] [<ffffffff8104a787>] ? pte_alloc_one+0x37/0x50
[ 2219.664007] [<ffffffff8167f3be>] ? _raw_spin_lock+0xe/0x20
[ 2219.664007] [<ffffffff8114c269>] ? __pte_alloc+0xa9/0x160
[ 2219.664007] [<ffffffff8117eed3>] do_sync_write+0xa3/0xe0
[ 2219.664007] [<ffffffff8117f563>] vfs_write+0xb3/0x180
[ 2219.664007] [<ffffffff8117f88a>] sys_write+0x4a/0x90
[ 2219.664007] [<ffffffff8168375e>] ? do_page_fault+0xe/0x10
[ 2219.664007] [<ffffffff81687da9>] system_call_fastpath+0x16/0x1b
[ 2219.664007] Code: 10 f6 c2 01 0f 84 4e 01 00 00 48 83 e2 fc 0f 84 10 ff ff ff 48 89 c1 48 89 d0 48 8b 50 08 48 39 ca 0f 85 71 ff ff ff 48 8b 50 10 <f6> 02 01 75 3a 48 8b 7a 08 48 89 c1 48 83 c9 01 48 89 78 10 48
[ 2219.664007] RIP [<ffffffff81323c93>] rb_erase+0x1a3/0x370
[ 2219.664007] RSP <ffff880073f318e8>
[ 2219.664007] CR2: 0000000000000000
[ 2219.724809] ---[ end trace dc0112a541a4c9dc ]---



2012-10-02 20:28:40

by Hugh Dickins

[permalink] [raw]
Subject: Re: kernel NULL pointer dereference at rb_erase+0x1a3/0x370

On Tue, 2 Oct 2012, Shuah Khan wrote:
> I started seeing the following null pointer dereference on
> a linux-next sept 21 git and still seeing it on linux-next
> Sep 27th git.
>
> Can be reproduced easily. I have been able to reproduce every
> time I do a complete build of a kernel on fresh checkout or
> touch a header file that forces full build.
>
> Related to lib/rbtree.c commits that went into September 21 perhaps.

That's a sensible suggestion, but I think probably not.

> I didn't get a chance to investigate this yet, thought I would share
> just in case others have seen it.
>
>
> [ 2219.660124] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 2219.660182] IP: [<ffffffff81323c93>] rb_erase+0x1a3/0x370
> [ 2219.660209] PGD 73f2f067 PUD 67246067 PMD 0
> [ 2219.660235] Oops: 0000 [#1] SMP
> [ 2219.660632] CPU 0
> [ 2219.660644] Pid: 1660, comm: recordmcount Not tainted 3.6.0-rc6-next-20120921+ #4 Hewlett-Packard HP EliteBook 6930p/30DC
> [ 2219.664007] Process recordmcount (pid: 1660, threadinfo ffff880073f30000, task ffff88002fe9ada0)
> [ 2219.664007] Call Trace:
> [ 2219.664007] [<ffffffff8125af6e>] ext4_es_insert_extent+0x28e/0x2f0

ext4_es_insert_extent: this is probably something I posted a fix for on
the 27th (copied below). They're reworking that series more thoroughly,
and it has for the moment dropped out of linux-next, so I expect you
won't see this error at all with a newer next. But you may want to
continue with your Sep 27th tree, and try out this patch anyway...

[PATCH next/mmotm] ext4: fix cache_es after merge_left

Kernel build with CONFIG_DEBUG_SLAB or CONFIG_SLUB_DEBUG slub_debug=FPZ
gives me kernel BUG at fs/ext4/extents_status.c:142! That's the
BUG_ON(es->start + es->len < es->start) in extent_status_end() called
from ext4_es_insert_extent(). tree->cache_es has been freed and poisoned.

This comes from when ext4_es_try_to_merge_left() merges es into leftward
es1, but ext4_es_insert_extent()'s out then updates cache_es to the freed
extent_status. ext4_es_try_to_merge_right() does not pose a problem.

Change ext4_es_try_to_merge_left() to return whichever extent_status
should be recorded in tree->cache_es. Remove cache_es update from
both of them, leaving that to ext4_es_insert_extent()'s out label.

Signed-off-by: Hugh Dickins <[email protected]>
---
fs/ext4/extents_status.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

--- mmotm/fs/ext4/extents_status.c 2012-09-26 10:15:29.340071552 -0700
+++ linux/fs/ext4/extents_status.c 2012-09-27 11:52:59.284937056 -0700
@@ -244,24 +244,24 @@ static void ext4_es_free_extent(struct e
kmem_cache_free(ext4_es_cachep, es);
}

-static void ext4_es_try_to_merge_left(struct ext4_es_tree *tree,
- struct extent_status *es)
+static struct extent_status *
+ext4_es_try_to_merge_left(struct ext4_es_tree *tree, struct extent_status *es)
{
struct extent_status *es1;
struct rb_node *node;

node = rb_prev(&es->rb_node);
if (!node)
- return;
+ return es;

es1 = rb_entry(node, struct extent_status, rb_node);
if (es->start == extent_status_end(es1) + 1) {
es1->len += es->len;
rb_erase(&es->rb_node, &tree->root);
- if (es == tree->cache_es)
- tree->cache_es = es1;
ext4_es_free_extent(es);
+ es = es1; /* Caller will update tree->cache_es to this */
}
+ return es;
}

static void ext4_es_try_to_merge_right(struct ext4_es_tree *tree,
@@ -278,9 +278,8 @@ static void ext4_es_try_to_merge_right(s
if (es1->start == extent_status_end(es) + 1) {
es->len += es1->len;
rb_erase(node, &tree->root);
- if (es1 == tree->cache_es)
- tree->cache_es = es;
ext4_es_free_extent(es1);
+ /* Caller will update tree->cache_es to es */
}
}

@@ -318,7 +317,7 @@ int ext4_es_insert_extent(struct inode *
es_debug("cached by [%u/%u)\n", es->start, es->len);
es->start = offset;
es->len += len;
- ext4_es_try_to_merge_left(tree, es);
+ es = ext4_es_try_to_merge_left(tree, es);
goto out;
} else if (es && in_range(offset, es->start, es->len)) {
es_debug("cached by [%u/%u)\n", es->start, es->len);

2012-10-02 21:10:14

by Theodore Ts'o

[permalink] [raw]
Subject: Re: kernel NULL pointer dereference at rb_erase+0x1a3/0x370

The patch series in question has already been dropped, since there
were some other problems found and we decided it wasn't ready for this
merge window.

If you look at the latest linux-next branch, you'll see that those
patches are gone, so it should be good once you upgrade to the latest
linux-next.

- Ted

2012-10-02 21:20:47

by Shuah Khan

[permalink] [raw]
Subject: Re: kernel NULL pointer dereference at rb_erase+0x1a3/0x370

On Tue, 2012-10-02 at 17:10 -0400, Theodore Ts'o wrote:
> The patch series in question has already been dropped, since there
> were some other problems found and we decided it wasn't ready for this
> merge window.
>
> If you look at the latest linux-next branch, you'll see that those
> patches are gone, so it should be good once you upgrade to the latest
> linux-next.
>
> - Ted

Great thanks.

-- Shuah

2012-10-02 21:33:31

by Shuah Khan

[permalink] [raw]
Subject: Re: kernel NULL pointer dereference at rb_erase+0x1a3/0x370

On Tue, 2012-10-02 at 13:27 -0700, Hugh Dickins wrote:
> On Tue, 2 Oct 2012, Shuah Khan wrote:
> > I started seeing the following null pointer dereference on
> > a linux-next sept 21 git and still seeing it on linux-next
> > Sep 27th git.
> >
> > Can be reproduced easily. I have been able to reproduce every
> > time I do a complete build of a kernel on fresh checkout or
> > touch a header file that forces full build.
> >
> > Related to lib/rbtree.c commits that went into September 21 perhaps.
>
> That's a sensible suggestion, but I think probably not.
>
> > I didn't get a chance to investigate this yet, thought I would share
> > just in case others have seen it.
> >
> >
> > [ 2219.660124] BUG: unable to handle kernel NULL pointer dereference at (null)
> > [ 2219.660182] IP: [<ffffffff81323c93>] rb_erase+0x1a3/0x370
> > [ 2219.660209] PGD 73f2f067 PUD 67246067 PMD 0
> > [ 2219.660235] Oops: 0000 [#1] SMP
> > [ 2219.660632] CPU 0
> > [ 2219.660644] Pid: 1660, comm: recordmcount Not tainted 3.6.0-rc6-next-20120921+ #4 Hewlett-Packard HP EliteBook 6930p/30DC
> > [ 2219.664007] Process recordmcount (pid: 1660, threadinfo ffff880073f30000, task ffff88002fe9ada0)
> > [ 2219.664007] Call Trace:
> > [ 2219.664007] [<ffffffff8125af6e>] ext4_es_insert_extent+0x28e/0x2f0
>
> ext4_es_insert_extent: this is probably something I posted a fix for on
> the 27th (copied below). They're reworking that series more thoroughly,
> and it has for the moment dropped out of linux-next, so I expect you
> won't see this error at all with a newer next. But you may want to
> continue with your Sep 27th tree, and try out this patch anyway...

Thanks for looking at both my null reference questions. In my case it is
easier to upgrade, will just do that.

-- Shuah
>
> [PATCH next/mmotm] ext4: fix cache_es after merge_left
>
> Kernel build with CONFIG_DEBUG_SLAB or CONFIG_SLUB_DEBUG slub_debug=FPZ
> gives me kernel BUG at fs/ext4/extents_status.c:142! That's the
> BUG_ON(es->start + es->len < es->start) in extent_status_end() called
> from ext4_es_insert_extent(). tree->cache_es has been freed and poisoned.
>
> This comes from when ext4_es_try_to_merge_left() merges es into leftward
> es1, but ext4_es_insert_extent()'s out then updates cache_es to the freed
> extent_status. ext4_es_try_to_merge_right() does not pose a problem.
>
> Change ext4_es_try_to_merge_left() to return whichever extent_status
> should be recorded in tree->cache_es. Remove cache_es update from
> both of them, leaving that to ext4_es_insert_extent()'s out label.
>
> Signed-off-by: Hugh Dickins <[email protected]>
> ---
> fs/ext4/extents_status.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> --- mmotm/fs/ext4/extents_status.c 2012-09-26 10:15:29.340071552 -0700
> +++ linux/fs/ext4/extents_status.c 2012-09-27 11:52:59.284937056 -0700
> @@ -244,24 +244,24 @@ static void ext4_es_free_extent(struct e
> kmem_cache_free(ext4_es_cachep, es);
> }
>
> -static void ext4_es_try_to_merge_left(struct ext4_es_tree *tree,
> - struct extent_status *es)
> +static struct extent_status *
> +ext4_es_try_to_merge_left(struct ext4_es_tree *tree, struct extent_status *es)
> {
> struct extent_status *es1;
> struct rb_node *node;
>
> node = rb_prev(&es->rb_node);
> if (!node)
> - return;
> + return es;
>
> es1 = rb_entry(node, struct extent_status, rb_node);
> if (es->start == extent_status_end(es1) + 1) {
> es1->len += es->len;
> rb_erase(&es->rb_node, &tree->root);
> - if (es == tree->cache_es)
> - tree->cache_es = es1;
> ext4_es_free_extent(es);
> + es = es1; /* Caller will update tree->cache_es to this */
> }
> + return es;
> }
>
> static void ext4_es_try_to_merge_right(struct ext4_es_tree *tree,
> @@ -278,9 +278,8 @@ static void ext4_es_try_to_merge_right(s
> if (es1->start == extent_status_end(es) + 1) {
> es->len += es1->len;
> rb_erase(node, &tree->root);
> - if (es1 == tree->cache_es)
> - tree->cache_es = es;
> ext4_es_free_extent(es1);
> + /* Caller will update tree->cache_es to es */
> }
> }
>
> @@ -318,7 +317,7 @@ int ext4_es_insert_extent(struct inode *
> es_debug("cached by [%u/%u)\n", es->start, es->len);
> es->start = offset;
> es->len += len;
> - ext4_es_try_to_merge_left(tree, es);
> + es = ext4_es_try_to_merge_left(tree, es);
> goto out;
> } else if (es && in_range(offset, es->start, es->len)) {
> es_debug("cached by [%u/%u)\n", es->start, es->len);