2013-07-19 17:33:02

by Mikael Pettersson

Subject: [3.11-rc1 regression] ext4_evict_inode triggers warn_slowpath_common on sparc64

I keep getting the following warning with 3.11-rc1 on sparc64:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 8174 at fs/ext4/inode.c:230 ext4_evict_inode+0x1f0/0x448()
Modules linked in: sunrpc af_packet ipv6 hid_generic snd_ali5451 snd_ac97_codec snd_seq snd_seq_device snd_pcm tg3 snd_timer flash ohci_pci hwmon snd soundcore ptp evdev sg i2c_ali1535 ohci_hcd pps_core snd_page_alloc i2c_core ac97_bus sr_mod cdrom pata_ali libata
CPU: 1 PID: 8174 Comm: xgcc Not tainted 3.11.0-rc1 #1
Call Trace:
[00000000004537b0] warn_slowpath_common+0x4c/0x64
[0000000000540d78] ext4_evict_inode+0x1f0/0x448
[00000000004f3938] evict+0xb8/0x190
[00000000004e99bc] do_unlinkat+0xf4/0x160
[0000000000406174] linux_sparc_syscall32+0x34/0x40
---[ end trace cd72b9e3e68d89e4 ]---

The Comm varies, but the call trace always looks like that. Happens a couple
of times per day, so far. No other ill effects observed. Didn't happen in 3.10
or older kernels.


2013-07-20 01:35:10

by Theodore Ts'o

Subject: Re: [3.11-rc1 regression] ext4_evict_inode triggers warn_slowpath_common on sparc64

On Fri, Jul 19, 2013 at 07:29:25PM +0200, Mikael Pettersson wrote:
> I keep getting the following warning with 3.11-rc1 on sparc64:
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 8174 at fs/ext4/inode.c:230 ext4_evict_inode+0x1f0/0x448()
> Modules linked in: sunrpc af_packet ipv6 hid_generic snd_ali5451 snd_ac97_codec snd_seq snd_seq_device snd_pcm tg3 snd_timer flash ohci_pci hwmon snd soundcore ptp evdev sg i2c_ali1535 ohci_hcd pps_core snd_page_alloc i2c_core ac97_bus sr_mod cdrom pata_ali libata
> CPU: 1 PID: 8174 Comm: xgcc Not tainted 3.11.0-rc1 #1
> Call Trace:
> [00000000004537b0] warn_slowpath_common+0x4c/0x64
> [0000000000540d78] ext4_evict_inode+0x1f0/0x448
> [00000000004f3938] evict+0xb8/0x190
> [00000000004e99bc] do_unlinkat+0xf4/0x160
> [0000000000406174] linux_sparc_syscall32+0x34/0x40
> ---[ end trace cd72b9e3e68d89e4 ]---
>
> The Comm varies, but the call trace always looks like that. Happens a couple
> of times per day, so far. No other ill effects observed. Didn't happen in 3.10
> or older kernels.

The fix, commit 822dbba33458cd6ad, is already in Linus's tree and will
be included in -rc2.

Note that this bug can cause memory corruption through a use-after-free.
I've not noticed a problem in my personal testing, but it's been
reported to me that after stress testing (using memory cgroups, among
other things), the system wedged when the machine was rebooted once
the tests had completed, and it only came back after the
watchdog timer fired. The fix was one of the first things
Linus pulled after releasing -rc1, so you can merge 47188d39b5de to
get the fixes.


Cheers,

- Ted

2013-07-22 11:17:09

by Mikael Pettersson

Subject: Re: [3.11-rc1 regression] ext4_evict_inode triggers warn_slowpath_common on sparc64

Theodore Ts'o writes:
> On Fri, Jul 19, 2013 at 07:29:25PM +0200, Mikael Pettersson wrote:
> > I keep getting the following warning with 3.11-rc1 on sparc64:
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 8174 at fs/ext4/inode.c:230 ext4_evict_inode+0x1f0/0x448()
> > Modules linked in: sunrpc af_packet ipv6 hid_generic snd_ali5451 snd_ac97_codec snd_seq snd_seq_device snd_pcm tg3 snd_timer flash ohci_pci hwmon snd soundcore ptp evdev sg i2c_ali1535 ohci_hcd pps_core snd_page_alloc i2c_core ac97_bus sr_mod cdrom pata_ali libata
> > CPU: 1 PID: 8174 Comm: xgcc Not tainted 3.11.0-rc1 #1
> > Call Trace:
> > [00000000004537b0] warn_slowpath_common+0x4c/0x64
> > [0000000000540d78] ext4_evict_inode+0x1f0/0x448
> > [00000000004f3938] evict+0xb8/0x190
> > [00000000004e99bc] do_unlinkat+0xf4/0x160
> > [0000000000406174] linux_sparc_syscall32+0x34/0x40
> > ---[ end trace cd72b9e3e68d89e4 ]---
> >
> > The Comm varies, but the call trace always looks like that. Happens a couple
> > of times per day, so far. No other ill effects observed. Didn't happen in 3.10
> > or older kernels.
>
> The fix, commit 822dbba33458cd6ad, is already in Linus's tree and will
> be included in -rc2.
>
> Note that this bug can cause memory corruption through a use-after-free.
> I've not noticed a problem in my personal testing, but it's been
> reported to me that after stress testing (using memory cgroups, among
> other things), the system wedged when the machine was rebooted once
> the tests had completed, and it only came back after the
> watchdog timer fired. The fix was one of the first things
> Linus pulled after releasing -rc1, so you can merge 47188d39b5de to
> get the fixes.

Thanks! Installing -rc2 today.

/Mikael