2011-05-13 09:17:06

by Amir Goldstein

Subject: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

Hi All,

I double-checked by doing a clean build of 2.6.39-rc7,
and I am still getting the crash below with xfstest 232.
All xfstests used to pass when I was running kernel 2.6.38, so
this must be a regression.

Unfortunately, I cannot verify that the previous kernel does not crash,
because I lost the connection to my test server and there is no one to
push the reset button over the weekend.

Can anyone try to reproduce the error with xfstest 005 and the crash
with xfstest 232?

Thanks,
Amir.

[ 1319.112544] EXT4-fs (sda8): mounted filesystem with ordered data
mode. Opts: acl,user_xattr,usrquota,grpquota
[ 1319.270023] EXT4-fs (sda8): re-mounted. Opts: (null)
[ 1319.271464] EXT4-fs (sda8): re-mounted. Opts: (null)
[ 1368.214854] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000018
[ 1368.219348] IP: [<ffffffff8122e152>] ext4_quota_off+0x42/0xd0
[ 1368.221628] PGD 0
[ 1368.222978] Oops: 0000 [#2] SMP
[ 1368.222978] last sysfs file:
/sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[ 1368.222978] CPU 0
[ 1368.222978] Modules linked in: binfmt_misc parport_pc ppdev
snd_hda_codec_realtek snd_hda_intel snd_hda_codec i915 snd_hwdep
snd_pcm drm_kms_helper drm snd_seq_midi snd_rawmidi e1000e
snd_seq_midi_event i2c_algo_bit snd_seq lp firewire_ohci firewire_core
snd_timer snd_seq_device snd soundcore snd_page_alloc psmouse parport
pata_marvell usbhid hid video intel_agp intel_gtt tpm_tis crc_itu_t
serio_raw tpm tpm_bios
[ 1368.222978]
[ 1368.222978] Pid: 2691, comm: quotaon Tainted: G M D
2.6.39-rc7 #9 /DQ35JO
[ 1368.222978] RIP: 0010:[<ffffffff8122e152>] [<ffffffff8122e152>]
ext4_quota_off+0x42/0xd0
[ 1368.222978] RSP: 0018:ffff8800c4bb3e28 EFLAGS: 00010292
[ 1368.222978] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018
[ 1368.222978] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000246
[ 1368.222978] RBP: ffff8800c4bb3e48 R08: 0000000000000001 R09: 0000000000000000
[ 1368.222978] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880114576000
[ 1368.222978] R13: ffff880114576000 R14: 0000000000000001 R15: 0000000000000000
[ 1368.222978] FS: 00007f5c2bf97720(0000) GS:ffff88012bc00000(0000)
knlGS:0000000000000000
[ 1368.222978] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1368.222978] CR2: 0000000000000018 CR3: 00000000c693f000 CR4: 00000000000006f0
[ 1368.222978] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1368.222978] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1368.222978] Process quotaon (pid: 2691, threadinfo
ffff8800c4bb2000, task ffff880116bc5ee0)
[ 1368.222978] Stack:
[ 1368.222978] 0000000000800003 0000000000000001 ffff880114576000
00000000ffffffda
[ 1368.222978] ffff8800c4bb3ef8 ffffffff811c9e05 0000000000000000
0000000000000000
[ 1368.222978] ffff8800c4bb3e78 ffff880114576068 ffff880115009800
ffff880114576068
[ 1368.222978] Call Trace:
[ 1368.222978] [<ffffffff811c9e05>] do_quotactl+0x4e5/0x560
[ 1368.222978] [<ffffffff815d376c>] ? down_read+0x4c/0x70
[ 1368.222978] [<ffffffff811711cf>] ? get_super+0x9f/0xd0
[ 1368.222978] [<ffffffff81189f78>] ? iput+0x48/0x200
[ 1368.222978] [<ffffffff811c9f4c>] sys_quotactl+0xcc/0x1a0
[ 1368.222978] [<ffffffff8116be26>] ? filp_close+0x66/0x90
[ 1368.222978] [<ffffffff812fd76e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1368.222978] [<ffffffff815dd2c2>] system_call_fastpath+0x16/0x1b
[ 1368.222978] Code: 89 74 24 18 0f 1f 44 00 00 48 63 c6 49 89 fc 41
89 f6 48 8b 9c c7 60 03 00 00 48 8b 87 90 04 00 00 f6 40 73 08 0f 85
7e 00 00 00
[ 1368.222978] 8b 7b 18 be 01 00 00 00 e8 c0 fb ff ff 48 3d 00 f0 ff ff 49
[ 1368.222978] RIP [<ffffffff8122e152>] ext4_quota_off+0x42/0xd0
[ 1368.222978] RSP <ffff8800c4bb3e28>
[ 1368.222978] CR2: 0000000000000018
[ 1368.310246] ---[ end trace 62a147f050ade229 ]---
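
A note for readers decoding the oops above: the faulting address (CR2) is 0x18 and RBX is 0, which would be consistent with the kernel loading a structure member at byte offset 0x18 through a NULL pointer. The userspace sketch below only illustrates that arithmetic. `struct demo_obj` and `member_address()` are hypothetical names, and the layout is invented for illustration; it is not the kernel's actual structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three 8-byte members precede `target`,
 * so `target` sits at byte offset 0x18. Not a real kernel struct. */
struct demo_obj {
    uint64_t a;
    uint64_t b;
    uint64_t c;
    uint64_t target;
};

/* Address that a load of base->target would touch, computed
 * arithmetically so that nothing is actually dereferenced. */
static uintptr_t member_address(uintptr_t base)
{
    return base + offsetof(struct demo_obj, target);
}
```

With a NULL base, `member_address(0)` evaluates to 0x18, the same shape of address as the one reported in CR2.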


On Fri, May 13, 2011 at 12:19 AM, Amir Goldstein <[email protected]> wrote:
> On Thu, May 12, 2011 at 9:03 PM, Amir Goldstein <[email protected]> wrote:
>> On Thu, May 12, 2011 at 7:27 PM, Amir Goldstein <[email protected]> wrote:
>>> Hi Jan,
>>>
>>> During testing of Ted's master branch merged with 2.6.39-rc7, I
>>> encountered 2 errors,
>>> before the system was hung.
>>>
>>> One error is consistent in xfstest 005 (Test symlinks & ELOOP):
>>> QA output created by 005
>>> *** touch deep symlinks
>>>
>>> No ELOOP? Unexpected!
>>>
>>> *** touch recusive symlinks
>>>
>>> ELOOP returned. Good.
>>>
>>>
>>> The other error is critical and you may be able to provide some input:
>>> while running xfstest 232 (Run fsstress with quotas enabled and verify
>>> accounted quotas in the end):
>>>
>>
>> FYI, this crash reproduced the second time I tried to run the test.
>> Now building kernel 2.6.39-rc7 (without ext4 master branch changes).
>> If my remote server doesn't hang over the weekend I will let you know
>> the test result.
>>
>
> Both bugs are reproduced on 2.6.39-rc7.
> Does anybody else see those results???
>
> Amir.
>
>>>
>>> [18339.351033] EXT4-fs (sda8): mounted filesystem with ordered data
>>> mode. Opts: acl,user_xattr,usrquota,grpquota
>>> [18339.386612] EXT4-fs (sda8): re-mounted. Opts: (null)
>>> [18339.397322] EXT4-fs (sda8): re-mounted. Opts: (null)
>>> [18406.012595] BUG: unable to handle kernel NULL pointer dereference
>>> at 0000000000000018
>>> [18406.012664] IP: [<ffffffff8122e202>] ext4_quota_off+0x42/0xd0
>>> [18406.012711] PGD 0
>>> [18406.012730] Oops: 0000 [#1] SMP
>>> [18406.012810] CPU 2
>>> [18406.012826] Modules linked in: next4 binfmt_misc parport_pc ppdev
>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec i915 snd_hwdep
>>> drm_kms_helper snd_pcm snd_seq_midi drm firewire_ohci firewire_core
>>> usbhid snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device
>>> snd e1000e psmouse tpm_tis serio_raw lp i2c_algo_bit hid tpm intel_agp
>>> pata_marvell parport soundcore crc_itu_t tpm_bios intel_gtt video
>>> snd_page_alloc
>>> [18406.013187]
>>> [18406.013201] Pid: 26309, comm: quotaon Tainted: G M
>>> 2.6.39-rc7+ #6 /DQ35JO
>>> [18406.013269] RIP: 0010:[<ffffffff8122e202>] [<ffffffff8122e202>]
>>> ext4_quota_off+0x42/0xd0
>>> [18406.013325] RSP: 0018:ffff88011cd57e28 EFLAGS: 00010292
>>> [18406.013361] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018
>>> [18406.013406] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000246
>>> [18406.013451] RBP: ffff88011cd57e48 R08: 0000000000000001 R09: 0000000000000000
>>> [18406.013497] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ca9b8800
>>> [18406.013541] R13: ffff8800ca9b8800 R14: 0000000000000001 R15: 0000000000000000
>>> [18406.013587] FS: 00007f602698b720(0000) GS:ffff88012bd00000(0000)
>>> knlGS:0000000000000000
>>> [18406.013639] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>> [18406.013676] CR2: 0000000000000018 CR3: 000000011332b000 CR4: 00000000000006e0
>>> [18406.013721] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>> [18406.013766] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>> [18406.013812] Process quotaon (pid: 26309, threadinfo
>>> ffff88011cd56000, task ffff880111bddee0)
>>> [18406.013864] Stack:
>>> [18406.013880] 0000000000800003 0000000000000001 ffff8800ca9b8800
>>> 00000000ffffffda
>>> [18406.013939] ffff88011cd57ef8 ffffffff811c9e05 0000000000000000
>>> 0000000000000000
>>> [18406.013998] ffff88011cd57e78 ffff8800ca9b8868 ffff880124621e00
>>> ffff8800ca9b8868
>>> [18406.014057] Call Trace:
>>> [18406.014079] [<ffffffff811c9e05>] do_quotactl+0x4e5/0x560
>>> [18406.014118] [<ffffffff815d2b1c>] ? down_read+0x4c/0x70
>>> [18406.014155] [<ffffffff811711cf>] ? get_super+0x9f/0xd0
>>> [18406.014190] [<ffffffff81189f78>] ? iput+0x48/0x200
>>> [18406.014224] [<ffffffff811c9f4c>] sys_quotactl+0xcc/0x1a0
>>> [18406.014260] [<ffffffff8116be26>] ? filp_close+0x66/0x90
>>> [18406.014298] [<ffffffff812fcb1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>>> [18406.014343] [<ffffffff815dc642>] system_call_fastpath+0x16/0x1b
>>> [18406.014382] Code: 89 74 24 18 0f 1f 44 00 00 48 63 c6 49 89 fc 41
>>> 89 f6 48 8b 9c c7 60 03 00 00 48 8b 87 90 04 00 00 f6 40 73 08 0f 85
>>> 7e 00 00 00
>>> [18406.014601] 8b 7b 18 be 01 00 00 00 e8 c0 fb ff ff 48 3d 00 f0 ff ff 49
>>> [18406.014712] RIP [<ffffffff8122e202>] ext4_quota_off+0x42/0xd0
>>> [18406.014756] RSP <ffff88011cd57e28>
>>> [18406.014780] CR2: 0000000000000018
>>> [18406.079351] ---[ end trace 2924f13a8b419b9a ]---
>>>
>>>
>>> The test was hung at quotacheck -u -g for a long time, so I dumped
>>> waiting tasks and got:
>>>
>>>
>>> [21278.671419] SysRq : Show Blocked State
>>> [21278.671427] task PC stack pid father
>>> [21278.671457] quotacheck D 00000001001ba0b0 0 26321 26123 0x00000000
>>> [21278.671464] ffff8801123f7da8 0000000000000046 ffff8801123f7df8
>>> 0000000017059fa0
>>> [21278.671472] ffff880100000000 ffff8801123f7fd8 ffff8801123f6000
>>> ffff8801123f7fd8
>>> [21278.671480] ffff880124f43f40 ffff880117059fa0 ffff8800ca9b8870
>>> 00000001ca9b8868
>>> [21278.671487] Call Trace:
>>> [21278.671498] [<ffffffff815d3775>] rwsem_down_failed_common+0xc5/0x160
>>> [21278.671504] [<ffffffff815d3823>] rwsem_down_write_failed+0x13/0x20
>>> [21278.671511] [<ffffffff812fca83>] call_rwsem_down_write_failed+0x13/0x20
>>> [21278.671517] [<ffffffff811904ce>] ? do_mount+0x21e/0x7e0
>>> [21278.671523] [<ffffffff815d2ac5>] ? down_write+0x65/0x70
>>> [21278.671527] [<ffffffff811904ce>] ? do_mount+0x21e/0x7e0
>>> [21278.671532] [<ffffffff811904ce>] do_mount+0x21e/0x7e0
>>> [21278.671537] [<ffffffff812fce81>] ? strncpy_from_user+0x31/0x40
>>> [21278.671543] [<ffffffff81179dd4>] ? getname_flags+0x74/0x240
>>> [21278.671548] [<ffffffff81190e50>] sys_mount+0x90/0xe0
>>> [21278.671554] [<ffffffff815dc642>] system_call_fastpath+0x16/0x1b
>>>
>>>
>>> I have had problems running xfstests on my machine (now Ubuntu 11.04).
>>> umount keeps failing on some specific tests (sometimes) and reporting:
>>> +umount: /mnt/test/scratch: device is busy.
>>> +        (In some cases useful info about processes that use
>>> +         the device is found by lsof(8) or fuser(1))
>>>
>>> Naturally, those partitions are dedicated for xfstests.
>>> I was never able to solve this problem so I set USE_REMOUNT=1 to avoid umount
>>> at least on the TEST partition.
>>>
>>> Any ideas?
>>>
>>> Amir.
>>>
>>
>


2011-05-13 09:27:18

by Sedat Dilek

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Fri, May 13, 2011 at 11:17 AM, Amir Goldstein <[email protected]> wrote:
> Hi All,
>
> I double checked myself and made a clean build of 2.6.39-rc7
> and I am still getting this crash below with xfstest 232.
> All xfstests used to pass when I was runing kernel 2.6.38, so
> this must be a regression.
>
> Unfortunately, I cannot double check there is no crash with previous kernel,
> because I lost connection with my test server and there is no one to
> push the reset button over the weekend.
>
> Can anyone try to reproduce the error with xfstest 005 and the crash
> with xfstest 232?
>
> Thanks,
> Amir.
>

Please attach your kernel-config for reproducible testing!

Is this really with 2.6.39-rc7 vanilla kernel (no extra patches!)?

Which xfstest (GIT) version is this?
Which versions of xfs{progs, -libs, dump}?

Here I am on i386 and can help with testing if this helps.

- Sedat -

2011-05-13 10:34:53

by Amir Goldstein

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Fri, May 13, 2011 at 12:27 PM, Sedat Dilek
<[email protected]> wrote:
> On Fri, May 13, 2011 at 11:17 AM, Amir Goldstein <[email protected]> wrote:
>> Hi All,
>>
>> I double checked myself and made a clean build of 2.6.39-rc7
>> and I am still getting this crash below with xfstest 232.
>> All xfstests used to pass when I was runing kernel 2.6.38, so
>> this must be a regression.
>>
>> Unfortunately, I cannot double check there is no crash with previous kernel,
>> because I lost connection with my test server and there is no one to
>> push the reset button over the weekend.
>>
>> Can anyone try to reproduce the error with xfstest 005 and the crash
>> with xfstest 232?
>>
>> Thanks,
>> Amir.
>>
>
> Please attach your kernel-config for reproducible testing!

Well, maybe next time; as I said, my remote testing server
is unreachable right now.
Not that it helps, but my config file originated from the Ubuntu 10.10
kernel (2.6.35-generic) and I added some kernel hacking options,
like LOCKDEP and friends.
I had CONFIG_EXT4_DEBUG and CONFIG_JBD2_DEBUG on.
All the rest was just defaults chosen by make oldconfig.
If it helps, I can send you a config file from my laptop, which
is supposed to be similar to the one on the server
(before the changes made by make oldconfig).

>
> Is this really with 2.6.39-rc7 vanilla kernel (no extra patches!)?
>

git checkout v2.6.39-rc7
make clean && make

Reported kernel version doesn't lie when you're building from GIT
and I didn't edit any files manually.


> Which xfstest (GIT) version is this?

I was using the current (GIT) version, fetched 2 days ago.
I do have a weird problem on my system, which I reported to the XFS list
a while back and could not find a cure for:
the "umount ; fsck" step after the tests sometimes fails to
umount (EBUSY) after specific tests.

I suspect that this problem in my system is also related to the crash,
because test 232 does 2 remounts and that may cause some race on my system.


> Which versions of xfs{progs, -libs, dump}?

Whatever tools came with the Ubuntu 11.04 upgrade... (cannot check now)
I suppose the e2fsprogs version is what matters when I am testing ext4, though.

>
> Here I am on i386 and can help with testing if this helps.
>

Great! So did you try running xfstests 005 and 232?
Did they pass?

Thanks,
Amir.

> - Sedat -
>

2011-05-13 14:56:12

by Theodore Ts'o

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Fri, May 13, 2011 at 12:17:03PM +0300, Amir Goldstein wrote:
>
> Can anyone try to reproduce the error with xfstest 005 and the crash
> with xfstest 232?

Xfstest #5 broke because of a change in the VFS, which now allows up
to 40 nested symlinks. So that's a matter of your xfstests being too
old.

The version of xfstests I've been using on my KVM box is too old to
have test 232, so I haven't been able to test it. I've been trying to
use a newer version of xfstests, but xfstests doesn't build on either
Ubuntu 10.04 (LTS), Debian stable, or Debian unstable, due to the use
of newer XFS ioctl's and xfsctl's that aren't defined in the system
header files. The fact that it doesn't work on Ubuntu LTS and Stable
is not that surprising, I suppose, but I was a bit disappointed that
it doesn't work on Debian unstable.

Since I don't have a Fedora system handy --- in which header file are
things like "struct xfs_flock64" supposed to be defined these days?

- Ted

2011-05-13 15:25:25

by Eric Sandeen

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On 5/13/11 9:56 AM, Ted Ts'o wrote:
> On Fri, May 13, 2011 at 12:17:03PM +0300, Amir Goldstein wrote:
>>
>> Can anyone try to reproduce the error with xfstest 005 and the crash
>> with xfstest 232?
>
> Xfstest #5 broke because of a change in the VFS, which now allows up
> to 40 nested symlinks. So that's a matter of your xfstests being too
> old.
>
> The version of xfstests I've been using on my KVM box is too old to
> have test 232, so I haven't been able to test it. I've been trying to
> use a newer version of xfstests, but xfstests doesn't build on either
> Ubuntu 10.04 (LTS), Debian stable, or Debian unstable, due to the use
> of newer XFS ioctl's and xfsctl's that aren't defined in the system
> header files. The fact that it doesn't work on Ubuntu LTS and Stable
> is not that surprising, I suppose, but I was a bit disappointed that
> it doesn't work on Debian unstable.

I missed that bug report :) If you can send me the details of the
failures we can probably add configure tests for any new ioctls
that are causing build failures.

> Since I don't have a Fedora system handy --- which header file are
> things like "struct xfs_flock64" supposed to be defined these days?

Hm, well, on my Fedora system, /usr/include/xfs/xfs_fs.h, from xfsprogs-devel.

I don't think that has changed in a very long time...

I also package an xfsprogs-qa-devel which has some additional pieces in
it to support xfstests. Debian could do the same ... "make install-qa"
in xfsprogs puts those bits into the root.

-Eric

> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


2011-05-13 17:25:04

by Amir Goldstein

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Fri, May 13, 2011 at 6:25 PM, Eric Sandeen <[email protected]> wrote:
> On 5/13/11 9:56 AM, Ted Ts'o wrote:
>> On Fri, May 13, 2011 at 12:17:03PM +0300, Amir Goldstein wrote:
>>>
>>> Can anyone try to reproduce the error with xfstest 005 and the crash
>>> with xfstest 232?
>>
>> Xfstest #5 broke because of a change in the VFS, which now allows up
>> to 40 nested symlinks. So that's a matter of your xfstests being too
>> old.
>>
>> The version of xfstests I've been using on my KVM box is too old to
>> have test 232, so I haven't been able to test it. I've been trying to
>> use a newer version of xfstests, but xfstests doesn't build on either
>> Ubuntu 10.04 (LTS), Debian stable, or Debian unstable, due to the use
>> of newer XFS ioctl's and xfsctl's that aren't defined in the system
>> header files. The fact that it doesn't work on Ubuntu LTS and Stable
>> is not that surprising, I suppose, but I was a bit disappointed that
>> it doesn't work on Debian unstable.
>
> I missed that bug report :) If you can send me the details of the
> failures we can probably add configure tests for any new ioctls
> that are causing build failures.
>
>> Since I don't have a Fedora system handy --- which header file are
>> things like "struct xfs_flock64" supposed to be defined these days?
>
> Hm, well, on my Fedora system, /usr/include/xfs/xfs_fs.h, from xfsprogs-devel.
>
> I don't think that has changed in a very long time...
>
> I also package an xfsprogs-qa-devel which has some additional pieces in
> it to support xfstests. Debian could do the same ... "make install-qa"
> in xfsprogs puts those bits into the root.
>

After xfstests failed to build on Ubuntu 10.10, I followed the advice emitted
by the build script to run "make install-qa", to solve the problem.
It took me a while to figure out exactly where I should run the command,
but in the end I pulled the xfsprogs tree, ran "make; make install;
make install-qa",
and from there on things were looking better.

Amir.

> -Eric
>
>> - Ted
>
>

2011-05-13 17:28:38

by Eric Sandeen

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On 5/13/11 12:25 PM, Amir Goldstein wrote:
> On Fri, May 13, 2011 at 6:25 PM, Eric Sandeen <[email protected]> wrote:
>> On 5/13/11 9:56 AM, Ted Ts'o wrote:
>>> On Fri, May 13, 2011 at 12:17:03PM +0300, Amir Goldstein wrote:
>>>>
>>>> Can anyone try to reproduce the error with xfstest 005 and the crash
>>>> with xfstest 232?
>>>
>>> Xfstest #5 broke because of a change in the VFS, which now allows up
>>> to 40 nested symlinks. So that's a matter of your xfstests being too
>>> old.
>>>
>>> The version of xfstests I've been using on my KVM box is too old to
>>> have test 232, so I haven't been able to test it. I've been trying to
>>> use a newer version of xfstests, but xfstests doesn't build on either
>>> Ubuntu 10.04 (LTS), Debian stable, or Debian unstable, due to the use
>>> of newer XFS ioctl's and xfsctl's that aren't defined in the system
>>> header files. The fact that it doesn't work on Ubuntu LTS and Stable
>>> is not that surprising, I suppose, but I was a bit disappointed that
>>> it doesn't work on Debian unstable.
>>
>> I missed that bug report :) If you can send me the details of the
>> failures we can probably add configure tests for any new ioctls
>> that are causing build failures.
>>
>>> Since I don't have a Fedora system handy --- which header file are
>>> things like "struct xfs_flock64" supposed to be defined these days?
>>
>> Hm, well, on my Fedora system, /usr/include/xfs/xfs_fs.h, from xfsprogs-devel.
>>
>> I don't think that has changed in a very long time...
>>
>> I also package an xfsprogs-qa-devel which has some additional pieces in
>> it to support xfstests. Debian could do the same ... "make install-qa"
>> in xfsprogs puts those bits into the root.
>>
>
> After xfstests failed to build on Ubuntu 10.10, I followed the advice omitted
> by the build script to run "make install-qa", to solve the problem.
> It took me a while to figure exactly where I should run the command,
> but in the end I pulled the xfsprogs tree, ran "make; make install;
> make install-qa"
> and from there on things were looking better.

I can ask Nathan if he can package the qa bits for debian.

Or, you all could just use Fedora ;)

-Eric

2011-05-13 17:31:56

by Eric Sandeen

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On 5/13/11 12:28 PM, Eric Sandeen wrote:
>> After xfstests failed to build on Ubuntu 10.10, I followed the advice omitted
>> by the build script to run "make install-qa", to solve the problem.
>> It took me a while to figure exactly where I should run the command,
>> but in the end I pulled the xfsprogs tree, ran "make; make install;
>> make install-qa"
>> and from there on things were looking better.
> I can ask Nathan if he can package the qa bits for debian.
>
> Or, you all could just use Fedora ;)
>
> -Eric

oh, or maybe all those files are already in xfslibs-dev:

http://packages.debian.org/lenny/amd64/xfslibs-dev/filelist

Can someone who has encountered xfstests build problems send a bug report to the xfs list please?
I'm sure we can get it sorted.

-Eric

2011-05-13 17:37:16

by Amir Goldstein

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Fri, May 13, 2011 at 8:28 PM, Eric Sandeen <[email protected]> wrote:
> On 5/13/11 12:25 PM, Amir Goldstein wrote:
>> On Fri, May 13, 2011 at 6:25 PM, Eric Sandeen <[email protected]> wrote:
>>> On 5/13/11 9:56 AM, Ted Ts'o wrote:
>>>> On Fri, May 13, 2011 at 12:17:03PM +0300, Amir Goldstein wrote:
>>>>>
>>>>> Can anyone try to reproduce the error with xfstest 005 and the crash
>>>>> with xfstest 232?
>>>>
>>>> Xfstest #5 broke because of a change in the VFS, which now allows up
>>>> to 40 nested symlinks. So that's a matter of your xfstests being too
>>>> old.
>>>>
>>>> The version of xfstests I've been using on my KVM box is too old to
>>>> have test 232, so I haven't been able to test it. I've been trying to
>>>> use a newer version of xfstests, but xfstests doesn't build on either
>>>> Ubuntu 10.04 (LTS), Debian stable, or Debian unstable, due to the use
>>>> of newer XFS ioctl's and xfsctl's that aren't defined in the system
>>>> header files. The fact that it doesn't work on Ubuntu LTS and Stable
>>>> is not that surprising, I suppose, but I was a bit disappointed that
>>>> it doesn't work on Debian unstable.
>>>
>>> I missed that bug report :) If you can send me the details of the
>>> failures we can probably add configure tests for any new ioctls
>>> that are causing build failures.
>>>
>>>> Since I don't have a Fedora system handy --- which header file are
>>>> things like "struct xfs_flock64" supposed to be defined these days?
>>>
>>> Hm, well, on my Fedora system, /usr/include/xfs/xfs_fs.h, from xfsprogs-devel.
>>>
>>> I don't think that has changed in a very long time...
>>>
>>> I also package an xfsprogs-qa-devel which has some additional pieces in
>>> it to support xfstests. Debian could do the same ... "make install-qa"
>>> in xfsprogs puts those bits into the root.
>>>
>>
>> After xfstests failed to build on Ubuntu 10.10, I followed the advice omitted
>> by the build script to run "make install-qa", to solve the problem.
>> It took me a while to figure exactly where I should run the command,
>> but in the end I pulled the xfsprogs tree, ran "make; make install;
>> make install-qa"
>> and from there on things were looking better.
>
> I can ask Nathan if he can package the qa bits for debian.
>
> Or, you all could just use Fedora ;)

As a side note, I started using Ubuntu because their Live CD can build
kernel modules (with no need to install any packages). I found it very
convenient for testing my module with various kernels.
Fedora Live CD wasn't as useful.

>
> -Eric
>

2011-05-13 22:49:49

by Theodore Ts'o

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Fri, May 13, 2011 at 08:25:02PM +0300, Amir Goldstein wrote:
>
> After xfstests failed to build on Ubuntu 10.10, I followed the
> advice omitted by the build script to run "make install-qa", to
> solve the problem. It took me a while to figure exactly where I
> should run the command, but in the end I pulled the xfsprogs tree,
> ran "make; make install; make install-qa" and from there on things
> were looking better.

I did that, but fsstress doesn't pull in the needed xfs/xfs_fs.h
header file. So it still dies.

- Ted

2011-05-14 07:11:57

by Amir Goldstein

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Sat, May 14, 2011 at 1:49 AM, Ted Ts'o <[email protected]> wrote:
> On Fri, May 13, 2011 at 08:25:02PM +0300, Amir Goldstein wrote:
>>
>> After xfstests failed to build on Ubuntu 10.10, I followed the
>> advice omitted by the build script to run "make install-qa", to
>> solve the problem. It took me a while to figure exactly where I
>> should run the command, but in the end I pulled the xfsprogs tree,
>> ran "make; make install; make install-qa" and from there on things
>> were looking better.
>
> I did that, but fsstress doesn't pull in the needed xfs/xfs_fs.h
> header file. So it still dies.
>
> - Ted
>

Well, anyway, the regression has to be from commit 21f97697:

ext4: remove unnecessary [cm]time update of quota file

because before that commit ext4_quota_off() was too short to
have a bug at ext4_quota_off+0x42/0xd0.

Jan, where are you? Don't make me debug this...

Amir.

2011-05-14 10:16:17

by Amir Goldstein

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)



Sent from my iPhone

On 14/05/2011, at 10:11, Amir Goldstein <[email protected]> wrote:

> On Sat, May 14, 2011 at 1:49 AM, Ted Ts'o <[email protected]> wrote:
>> On Fri, May 13, 2011 at 08:25:02PM +0300, Amir Goldstein wrote:
>>>
>>> After xfstests failed to build on Ubuntu 10.10, I followed the
>>> advice omitted by the build script to run "make install-qa", to
>>> solve the problem. It took me a while to figure exactly where I
>>> should run the command, but in the end I pulled the xfsprogs tree,
>>> ran "make; make install; make install-qa" and from there on things
>>> were looking better.
>>
>> I did that, but fsstress doesn't pull in the needed xfs/xfs_fs.h
>> header file. So it still dies.
>>
>> - Ted
>>
>
> Well, anyway, the regression has to be from commit 21f97697:
>
> ext4: remove unnecessary [cm]time update of quota file
>
> because before that commit ext4_quota_off() was too short to
> have a bug at ext4_quota_off+0x42/0xd0.
>
> Jan, where are you? don't make me debug this...

So I guess that Jan's patch is missing a NULL check:
	if (!inode)
		goto out;

So much for writing patches from my mobile ;-)

>
> Amir.

2011-05-16 09:43:47

by Jan Kara

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Sat 14-05-11 13:16:25, Amir Goldstein wrote:
> Sent from my iPhone
>
> On 14/05/2011, at 10:11, Amir Goldstein <[email protected]> wrote:
>
> >On Sat, May 14, 2011 at 1:49 AM, Ted Ts'o <[email protected]> wrote:
> >>On Fri, May 13, 2011 at 08:25:02PM +0300, Amir Goldstein wrote:
> >>>
> >>>After xfstests failed to build on Ubuntu 10.10, I followed the
> >>>advice omitted by the build script to run "make install-qa", to
> >>>solve the problem. It took me a while to figure exactly where I
> >>>should run the command, but in the end I pulled the xfsprogs tree,
> >>>ran "make; make install; make install-qa" and from there on things
> >>>were looking better.
> >>
> >>I did that, but fsstress doesn't pull in the needed xfs/xfs_fs.h
> >>header file. So it still dies.
> >>
> >> - Ted
> >>
> >
> >Well, anyway, the regression has to be from commit 21f97697:
> >
> >ext4: remove unnecessary [cm]time update of quota file
> >
> >because before that commit ext4_quota_off() was too short to
> >have a bug at ext4_quota_off+0x42/0xd0.
> >
> >Jan, where are you? don't make me debug this...
>
> So I guess that Jan's patch is missing
> If (!inode)
> goto out;
  Exactly, that is it. I cannot trigger the problem anymore with the patch.
I just wonder how you spotted the problem, because test 232 does not
trigger it for me: it is triggered when you run quotaoff without
running quotaon, and that does not happen with test 232. Anyway, with the
attached patch, running quotaoff on a filesystem without quotas turned on
works fine, whereas previously it oopsed.
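
The failure mode and the guard can be modelled in userspace as follows. This is only a sketch under stated assumptions, not the attached patch: `struct demo_sb`, `struct demo_inode`, and `demo_quota_off()` are hypothetical stand-ins, and the timestamp update is reduced to a single field. The point it illustrates is that when quota was never turned on there is no quota file inode, so any per-inode work has to be skipped before the common turn-off path runs.

```c
#include <stddef.h>

#define DEMO_MAXQUOTAS 2   /* user and group, as in the mount options above */

/* Hypothetical stand-in for a quota file inode. */
struct demo_inode {
    long mtime;
};

/* Hypothetical stand-in for per-superblock quota state: one quota file
 * inode per quota type, NULL when quota was never turned on. */
struct demo_sb {
    struct demo_inode *files[DEMO_MAXQUOTAS];
};

/* Model of a quota-off handler. Returns 1 if the quota file timestamp
 * was updated, 0 if there was no quota file and the update was skipped. */
static int demo_quota_off(struct demo_sb *sb, int type, long now)
{
    struct demo_inode *inode = sb->files[type];
    int updated = 0;

    /* The guard discussed in the thread: without it, the update below
     * would dereference a NULL inode when quotaoff runs without a
     * preceding quotaon. */
    if (!inode)
        goto out;

    inode->mtime = now;
    updated = 1;
out:
    /* Stand-in for the common turn-off path, which runs either way. */
    sb->files[type] = NULL;
    return updated;
}
```

Calling `demo_quota_off()` on a `demo_sb` whose `files[type]` is NULL returns 0 instead of faulting, which corresponds to the "quotaoff without quotaon" case described above.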

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR


Attachments:
0001-ext4-Fix-oops-in-ext4_quota_off.patch (940.00 B)

2011-05-16 09:59:58

by Amir Goldstein

Subject: Re: Regression with ext4 in kernel 2.6.39-rc7? (Was: testing ext4 master branch)

On Mon, May 16, 2011 at 12:43 PM, Jan Kara <[email protected]> wrote:
> On Sat 14-05-11 13:16:25, Amir Goldstein wrote:
>> Sent from my iPhone
>>
>> On 14/05/2011, at 10:11, Amir Goldstein <[email protected]> wrote:
>>
>> >On Sat, May 14, 2011 at 1:49 AM, Ted Ts'o <[email protected]> wrote:
>> >>On Fri, May 13, 2011 at 08:25:02PM +0300, Amir Goldstein wrote:
>> >>>
>> >>>After xfstests failed to build on Ubuntu 10.10, I followed the
>> >>>advice omitted by the build script to run "make install-qa", to
>> >>>solve the problem. It took me a while to figure exactly where I
>> >>>should run the command, but in the end I pulled the xfsprogs tree,
>> >>>ran "make; make install; make install-qa" and from there on things
>> >>>were looking better.
>> >>
>> >>I did that, but fsstress doesn't pull in the needed xfs/xfs_fs.h
>> >>header file. So it still dies.
>> >>
>> >> - Ted
>> >>
>> >
>> >Well, anyway, the regression has to be from commit 21f97697:
>> >
>> >ext4: remove unnecessary [cm]time update of quota file
>> >
>> >because before that commit ext4_quota_off() was too short to
>> >have a bug at ext4_quota_off+0x42/0xd0.
>> >
>> >Jan, where are you? don't make me debug this...
>>
>> So I guess that Jan's patch is missing
>> If (!inode)
>>     goto out;
> Exactly, that is it. I cannot trigger the problem anymore with the patch.
> I just wonder how come you've spotted the problem because test 232 does not
> trigger the problem for me -

It does not surprise me.
As I wrote earlier, my system (Ubuntu 11.04) is behaving strangely.
Some tests fail the fsck at the end, because the filesystem fails to umount
immediately after the test (lsof shows nothing and a manual umount succeeds).
That is probably why test 232 fails (even after the crash was fixed)
and why quotaoff is called without quotaon in test 232 on
my system.

> it is triggered when you run quotaoff without
> running quotaon and that does not happen with test 232.. Anyway, with the
> attached patch running quotaoff on filesystem without quotas turned on
> works fine whereas previously it oopsed.
>
>                                                                 Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR
>