2012-06-05 09:16:46

by Ondrej Zary

Subject: [bisected] NFS corruption with 3.4

Hello,
I use NFS for deploying HDD images on new machines. My machine has 2nd network
card just for this, running DHCPD, TFTPD and kernel NFS server. The target
machine is set to boot from LAN and boots SystemRescueCD from my machine with
an autorun script that launches Partimage and deploys the HDD image (400 to
900 MB compressed).

It worked fine for years, until now. With kernel 3.4, everything
works only for the first time after boot (and not always). Next time (next
machine), partimage aborts almost immediately as it's probably unable to
decompress the image file. md5sum is different on my machine vs. on the
target (through NFS). Also SystemRescueCD boot aborts with md5 error
sometimes. Everything works fine after rebooting back to 3.3.
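
One way to narrow down a checksum mismatch like this is to compare the two copies page by page rather than as a whole, so that the damaged offsets become visible. A minimal sketch, assuming a 4096-byte page size and that both copies can be opened as ordinary binary files (for example the server's local file and the same file read through the NFS mount). The names `differing_pages` and `PAGE_SIZE` are illustrative and not part of any tool mentioned in this thread:

```python
import io

PAGE_SIZE = 4096  # assumption: typical x86 page size


def differing_pages(file_a, file_b, page_size=PAGE_SIZE):
    """Return indices of page-sized chunks that differ between two binary files.

    file_a and file_b are objects with a read(n) method, such as files
    opened with open(path, "rb") or io.BytesIO buffers.
    """
    bad = []
    index = 0
    while True:
        a = file_a.read(page_size)
        b = file_b.read(page_size)
        if not a and not b:
            return bad  # both files exhausted
        if a != b:
            bad.append(index)  # content or length differs in this page
        index += 1


if __name__ == "__main__":
    # Self-contained demo: flip one byte in the second page of a 3-page buffer.
    good = bytes(3 * PAGE_SIZE)
    corrupt = bytearray(good)
    corrupt[PAGE_SIZE + 10] = 0xFF
    print(differing_pages(io.BytesIO(good), io.BytesIO(bytes(corrupt))))
```

With real files, the same function can be called on two handles opened with `open(path, "rb")`. If the bad indices change from run to run, that would point at a caching problem rather than at damage on disk.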

Bisection found this:

0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
Author: Konstantin Khlebnikov <[email protected]>
Date: Wed Mar 28 14:42:54 2012 -0700

radix-tree: use iterators in find_get_pages* functions

Reverting this commit in 3.4 fixes the problem.


--
Ondrej Zary


2012-06-05 12:46:12

by Dave Jones

Subject: Re: [bisected] NFS corruption with 3.4

On Tue, Jun 05, 2012 at 11:16:17AM +0200, Ondrej Zary wrote:
> Hello,
> I use NFS for deploying HDD images on new machines. My machine has 2nd network
> card just for this, running DHCPD, TFTPD and kernel NFS server. The target
> machine is set to boot from LAN and boots SystemRescueCD from my machine with
> an autorun script that launches Partimage and deploys the HDD image (400 to
> 900 MB compressed).
>
> It worked fine for years, until now. With kernel 3.4, everything
> works only for the first time after boot (and not always). Next time (next
> machine), partimage aborts almost immediately as it's probably unable to
> decompress the image file. md5sum is different on my machine vs. on the
> target (through NFS). Also SystemRescueCD boot aborts with md5 error
> sometimes. Everything works fine after rebooting back to 3.3.
>
> Bisection found this:
>
> 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
> commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
> Author: Konstantin Khlebnikov <[email protected]>
> Date: Wed Mar 28 14:42:54 2012 -0700
>
> radix-tree: use iterators in find_get_pages* functions
>
> Reverting this commit in 3.4 fixes the problem.

I meant to come back to this, because I saw this problem too.

Is this patch a problem for the client or the server?
I'm assuming the server, because I saw at least a similar sounding
problem using an OSX client->Linux server.

Dave

2012-06-05 13:32:59

by Konstantin Khlebnikov

Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Ondrej Zary wrote:
> Hello,
> I use NFS for deploying HDD images on new machines. My machine has 2nd network
> card just for this, running DHCPD, TFTPD and kernel NFS server. The target
> machine is set to boot from LAN and boots SystemRescueCD from my machine with
> an autorun script that launches Partimage and deploys the HDD image (400 to
> 900 MB compressed).
>
> It worked fine for years, until now. With kernel 3.4, everything
> works only for the first time after boot (and not always). Next time (next
> machine), partimage aborts almost immediately as it's probably unable to
> decompress the image file. md5sum is different on my machine vs. on the
> target (through NFS). Also SystemRescueCD boot aborts with md5 error
> sometimes. Everything works fine after rebooting back to 3.3.
>
> Bisection found this:
>
> 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
> commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
> Author: Konstantin Khlebnikov<[email protected]>
> Date: Wed Mar 28 14:42:54 2012 -0700
>
> radix-tree: use iterators in find_get_pages* functions
>
> Reverting this commit in 3.4 fixes the problem.
>
>

[all reporters added to CC] let's keep all in one thread

Attached are two patches which might help to debug this regression:

"mm: recheck page index in find_get_pages_contig" adds a paranoid check to find_get_pages_contig().
It could explain everything, but currently I don't see how this can happen.
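
To illustrate what a recheck of the page index means in a contiguous lookup, here is a toy userspace model. It is only a sketch under stated assumptions and is not the attached patch, whose contents are not shown in this thread. The `Page` class, the `slots` dictionary standing in for the radix tree, and the `find_contig` function are all hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class Page:
    index: int          # file offset (in pages) the page itself claims to cache
    data: bytes = b""


def find_contig(slots, start, nr_pages, recheck=True):
    """Collect up to nr_pages pages from consecutive slots, beginning at start.

    slots maps slot number -> Page and is a stand-in for the page cache tree.
    With recheck=True the run also ends at the first page whose own index
    does not match the slot it was found in.
    """
    found = []
    for slot in range(start, start + nr_pages):
        page = slots.get(slot)
        if page is None:
            break  # hole in the cache: the contiguous run ends here
        if recheck and page.index != slot:
            break  # page does not belong at this offset: stop instead of returning it
        found.append(page)
    return found
```

In this model, a lookup without the recheck would hand back a page holding data for a different offset, which is the kind of silent corruption the reporters describe. With the recheck the caller gets a shorter run and has to look up the remaining pages another way.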

"mm: debug fing_get_pages speculative restart" shows the lookup restart condition
which was removed by the bisected commit. Hans tested it, but without success:

Hans de Bruin wrote:
> On 06/04/2012 12:31 PM, Konstantin Khlebnikov wrote:
>> Hans de Bruin wrote:
>>> On 06/01/2012 09:11 PM, Hans de Bruin wrote:
>>>> On 05/29/2012 12:19 AM, Hans de Bruin wrote:
>>>>> I just upgraded my home server from kernel 3.3.5 to 3.4.0 and ran into
>>>>> some trouble. My laptop, a nfsroot client, will not run firefox and
>>>>> thunderbird anymore. When I start these programs from an xterm, the
>>>>> cursor goes to the next line and waits indefinitely.
>>>>>
>>>>> I do not know if there is any order in lsof's output. A lsof | grep
>>>>> firefox or thunderbird shows ......./.parentlock as the last line.
>>>>>
>>>>> It does not matter whether the client is running a 3.4.0 or a 3.3.0
>>>>> kernel, or if the server is running on top of xen or not.
>>>>>
>>>>> There is some noise in the servers dmesg:
>>>>>
>>>>> [ 241.256684] INFO: task kworker/u:2:801 blocked for more than 120
>>>>> seconds.
>>>>> [ 241.256691] "echo 0> /proc/sys/kernel/hung_task_timeout_secs"
>>>>
>>>> ...
>>>>
>>>> On an almost identical test system firefox and thunderbird segfault after
>>>> upgrading to 3.4.0. It would have been nice if it behaved exactly
>>>> like my home server. I bisected the segfault to:
>>>>
>>>> commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
>>>> Author: Konstantin Khlebnikov<[email protected]>
>>>> Date: Wed Mar 28 14:42:54 2012 -0700
>>>>
>>>> radix-tree: use iterators in find_get_pages* functions
>>>>
>>>>
>>>> When I revert that on top of 3.4.0 the segfaults are gone, but both
>>>> firefox and thunderbird go into the "wait indefinitely" mode like on the
>>>> home server.
>>>>
>>>> I am going to make a bit-wise copy from my home server to my
>>>> test server and try again.
>>>>
>>>
>>> The bit-wise copy also segfaults firefox and thunderbird at the same
>>> commit.
>>>
>>
>> I think the bug is somewhere in NFS; that patch only highlighted it.
>> Please, try to run it with debug patch from attachment.
>
> Before I can start firefox from an xterm the lines below are shown on
> the server:
>
> [ 241.260076] INFO: task kworker/u:2:791 blocked for more than 120 seconds.
> [ 241.260084] "echo 0> /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 241.260090] kworker/u:2 D 000000000000000c 0 791 2
> 0x00000000
> [ 241.260102] ffff8801390b1cf0 0000000000000046 0000000000012d00
> 0000000000012d00
> [ 241.260113] 0000000000012d00 ffff880139141470 0000000000012d00
> ffff8801390b1fd8
> [ 241.260124] ffff8801390b1fd8 0000000000012d00 ffff880139cdc420
> ffff880139141470
> [ 241.260135] Call Trace:
> [ 241.260152] [<ffffffff81513116>] schedule+0x64/0x66
> [ 241.260162] [<ffffffff812005a6>] cld_pipe_upcall+0x95/0xd1
> [ 241.260173] [<ffffffff811faae5>] ? nfsd4_exchange_id+0x23e/0x23e
> [ 241.260182] [<ffffffff81200a5e>] nfsd4_cld_grace_done+0x50/0x8a
> [ 241.260191] [<ffffffff81200f8b>] nfsd4_record_grace_done+0x18/0x1a
> [ 241.260200] [<ffffffff811fab2f>] laundromat_main+0x4a/0x213
> [ 241.260210] [<ffffffff81069aeb>] ? need_resched+0x1e/0x28
> [ 241.260218] [<ffffffff81513035>] ? __schedule+0x49d/0x4b5
> [ 241.260227] [<ffffffff811faae5>] ? nfsd4_exchange_id+0x23e/0x23e
> [ 241.260237] [<ffffffff8105b8ad>] process_one_work+0x190/0x28d
> [ 241.260248] [<ffffffff8105c4e7>] worker_thread+0x105/0x189
> [ 241.260260] [<ffffffff81513a75>] ? _raw_spin_unlock_irqrestore+0x1a/0x1d
> [ 241.260274] [<ffffffff8105c3e2>] ? manage_workers.clone.17+0x173/0x173
> [ 241.260287] [<ffffffff8105ff30>] kthread+0x8a/0x92
> [ 241.260325] [<ffffffff815158a4>] kernel_thread_helper+0x4/0x10
> [ 241.260335] [<ffffffff8105fea6>] ?
> kthread_freezable_should_stop+0x47/0x47
> [ 241.260343] [<ffffffff815158a0>] ? gs_change+0x13/0x13
> [ 361.260025] INFO: task kworker/u:2:791 blocked for more than 120 seconds.
> [ 361.260032] "echo 0> /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 361.260039] kworker/u:2 D 000000000000000c 0 791 2
> 0x00000000
> [ 361.260051] ffff8801390b1cf0 0000000000000046 0000000000012d00
> 0000000000012d00
> [ 361.260062] 0000000000012d00 ffff880139141470 0000000000012d00
> ffff8801390b1fd8
> [ 361.260072] ffff8801390b1fd8 0000000000012d00 ffff880139cdc420
> ffff880139141470
> [ 361.260084] Call Trace:
> [ 361.260099] [<ffffffff81513116>] schedule+0x64/0x66
> [ 361.260110] [<ffffffff812005a6>] cld_pipe_upcall+0x95/0xd1
> [ 361.260121] [<ffffffff811faae5>] ? nfsd4_exchange_id+0x23e/0x23e
> [ 361.260130] [<ffffffff81200a5e>] nfsd4_cld_grace_done+0x50/0x8a
> [ 361.260139] [<ffffffff81200f8b>] nfsd4_record_grace_done+0x18/0x1a
> [ 361.260148] [<ffffffff811fab2f>] laundromat_main+0x4a/0x213
> [ 361.260158] [<ffffffff81069aeb>] ? need_resched+0x1e/0x28
> [ 361.260166] [<ffffffff81513035>] ? __schedule+0x49d/0x4b5
> [ 361.260175] [<ffffffff811faae5>] ? nfsd4_exchange_id+0x23e/0x23e
> [ 361.260185] [<ffffffff8105b8ad>] process_one_work+0x190/0x28d
> [ 361.260194] [<ffffffff8105c4e7>] worker_thread+0x105/0x189
> [ 361.260203] [<ffffffff81513a75>] ? _raw_spin_unlock_irqrestore+0x1a/0x1d
> [ 361.260213] [<ffffffff8105c3e2>] ? manage_workers.clone.17+0x173/0x173
> [ 361.260222] [<ffffffff8105ff30>] kthread+0x8a/0x92
> [ 361.260231] [<ffffffff815158a4>] kernel_thread_helper+0x4/0x10
> [ 361.260241] [<ffffffff8105fea6>] ?
> kthread_freezable_should_stop+0x47/0x47
> [ 361.260249] [<ffffffff815158a0>] ? gs_change+0x13/0x13
> [ 481.260010] INFO: task kworker/u:2:791 blocked for more than 120 seconds.
> [ 481.260019] "echo 0> /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 481.260028] kworker/u:2 D 000000000000000c 0 791 2
> 0x00000000
> [ 481.260043] ffff8801390b1cf0 0000000000000046 0000000000012d00
> 0000000000012d00
> [ 481.260058] 0000000000012d00 ffff880139141470 0000000000012d00
> ffff8801390b1fd8
> [ 481.260073] ffff8801390b1fd8 0000000000012d00 ffff880139cdc420
> ffff880139141470
> [ 481.260088] Call Trace:
> [ 481.260107] [<ffffffff81513116>] schedule+0x64/0x66
> [ 481.260120] [<ffffffff812005a6>] cld_pipe_upcall+0x95/0xd1
> [ 481.260135] [<ffffffff811faae5>] ? nfsd4_exchange_id+0x23e/0x23e
> [ 481.260147] [<ffffffff81200a5e>] nfsd4_cld_grace_done+0x50/0x8a
> [ 481.260159] [<ffffffff81200f8b>] nfsd4_record_grace_done+0x18/0x1a
> [ 481.260172] [<ffffffff811fab2f>] laundromat_main+0x4a/0x213
> [ 481.260185] [<ffffffff81069aeb>] ? need_resched+0x1e/0x28
> [ 481.260196] [<ffffffff81513035>] ? __schedule+0x49d/0x4b5
> [ 481.260206] [<ffffffff811faae5>] ? nfsd4_exchange_id+0x23e/0x23e
> [ 481.260215] [<ffffffff8105b8ad>] process_one_work+0x190/0x28d
> [ 481.260225] [<ffffffff8105c4e7>] worker_thread+0x105/0x189
> [ 481.260234] [<ffffffff81513a75>] ? _raw_spin_unlock_irqrestore+0x1a/0x1d
> [ 481.260243] [<ffffffff8105c3e2>] ? manage_workers.clone.17+0x173/0x173
> [ 481.260252] [<ffffffff8105ff30>] kthread+0x8a/0x92
> [ 481.260262] [<ffffffff815158a4>] kernel_thread_helper+0x4/0x10
> [ 481.260271] [<ffffffff8105fea6>] ?
> kthread_freezable_should_stop+0x47/0x47
> [ 481.260279] [<ffffffff815158a0>] ? gs_change+0x13/0x13
>
>
> dmesg on the client side:
>
> [ 27.607606] gtk-query-immod[1976]: segfault at 1d2d1f30 ip b7734391
> sp bfe3e984 error 4 in ld-2.13.so[b772b000+1d000]
> [ 48.136763] start_kdeinit (2086): /proc/2086/oom_adj is deprecated,
> please use /proc/2086/oom_score_adj instead.
> [ 75.801804] blueman-applet[2150]: segfault at 1cf2cf30 ip b7741391 sp
> bfb456b8 error 4 in ld-2.13.so[b7738000+1d000]
> [ 140.226371] firefox[2175]: segfault at 1b065f30 ip b76f6391 sp
> bfb15db8 error 4 in ld-2.13.so[b76ed000+1d000]
>
>
> The firefox dump on client side produces no messages on server side.
>
> md5sum's of ld-2.13.so are equal on server and client and across
> kernel versions.
>
>
>
> Did I miss the output of the debug patch?
>
>


Attachments:
mm-debug-fing_get_pages-speculative-restart (1.75 kB)
mm-recheck-page-index-in-find_get_pages_contig (477.00 B)

2012-06-05 13:46:07

by Holger Hoffstätte

Subject: Re: [bisected] NFS corruption with 3.4

On Tue, 05 Jun 2012 08:45:37 -0400, Dave Jones wrote:

> > It worked fine for years, until now. With kernel 3.4, everything works
> > only for the first time after boot (and not always). Next time (next
> > machine), partimage aborts almost immediately as it's probably unable
> > to decompress the image file. md5sum is different on my machine vs. on
> > the target (through NFS). Also SystemRescueCD boot aborts with md5
> > error sometimes. Everything works fine after rebooting back to 3.3.
> >
> > Bisection found this:
> >
> > 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit commit
> > 0fc9d1040313047edf6a39fd4d7c7defdca97c62 Author: Konstantin Khlebnikov
> > <[email protected]> Date: Wed Mar 28 14:42:54 2012 -0700
> >
> > radix-tree: use iterators in find_get_pages* functions
> >
> > Reverting this commit in 3.4 fixes the problem.
>
> I meant to come back to this, because I saw this problem too.

Same here, seen just yesterday.

> is this patch a problem for the client, or the server ? I'm assuming the

In my case I tried to unpack a remote kernel tarball locally to a client
and suddenly got gzip/tar checksum/EOF errors, which repeatably didn't
show up when unpacking said archive directly on the server. Somewhat
confused I re-created a fresh tarball, which then unpacked fine on the
client. Looks like this is a pagecache race/staleness issue.

-h

2012-06-05 14:11:38

by Ondrej Zary

Subject: Re: [bisected] NFS corruption with 3.4

On Tuesday 05 June 2012, Dave Jones wrote:
> On Tue, Jun 05, 2012 at 11:16:17AM +0200, Ondrej Zary wrote:
> > Hello,
> > I use NFS for deploying HDD images on new machines. My machine has 2nd
> > network card just for this, running DHCPD, TFTPD and kernel NFS server.
> > The target machine is set to boot from LAN and boots SystemRescueCD from
> > my machine with an autorun script that launches Partimage and deploys
> > the HDD image (400 to 900 MB compressed).
> >
> > It worked fine for years, until now. With kernel 3.4, everything
> > works only for the first time after boot (and not always). Next time
> > (next machine), partimage aborts almost immediately as it's probably
> > unable to decompress the image file. md5sum is different on my machine
> > vs. on the target (through NFS). Also SystemRescueCD boot aborts with
> > md5 error sometimes. Everything works fine after rebooting back to 3.3.
> >
> > Bisection found this:
> >
> > 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
> > commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
> > Author: Konstantin Khlebnikov <[email protected]>
> > Date: Wed Mar 28 14:42:54 2012 -0700
> >
> > radix-tree: use iterators in find_get_pages* functions
> >
> > Reverting this commit in 3.4 fixes the problem.
>
> I meant to come back to this, because I saw this problem too.
>
> is this patch a problem for the client, or the server ?
> I'm assuming the server, because I saw at least a similar sounding
> problem using an OSX client->Linux server.
>
> Dave

Yes, this patch breaks Linux NFS server.

--
Ondrej Zary

2012-06-05 14:21:34

by Toralf Förster

Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

On 06/05/2012 03:32 PM, Konstantin Khlebnikov wrote:
>
> [all reporters added to CC] let's keep all in one thread
>
> In attachment two patches which might help to debug this regression:

Hi Konstantin,

the output after applying your 2 patches on top of the current git tree is
shown below. FWIW reverting commit 0fc9d10 on top of 99becf1 solves the
issue for me.


2012-06-05T16:12:29.745+02:00 n22 kernel: ------------[ cut here
]------------
2012-06-05T16:12:29.746+02:00 n22 kernel: WARNING: at mm/filemap.c:940
find_get_pages_contig+0x16a/0x1a0()
2012-06-05T16:12:29.746+02:00 n22 kernel: Hardware name: 6474B84
2012-06-05T16:12:29.746+02:00 n22 kernel: Modules linked in: loop fuse
dm_mod usblp hid_generic hid_cherry i915 usbhid hid fbcon font bitblit
softcursor drm_kms_helper drm fb fbdev intel_agp cfbcopyarea
i2c_algo_bit intel_gtt 8250_pci 8250 cfbimgblt i2c_i801 sr_mod
thinkpad_acpi hwmon video acpi_cpufreq nvram ac thermal sg evdev psmouse
serial_core i2c_core cdrom agpgart button battery mperf processor wmi
cfbfillrect [last unloaded: microcode]
2012-06-05T16:12:29.746+02:00 n22 kernel: Pid: 2359, comm: loop0 Not
tainted 3.5.0-rc1-00037-g99becf1-dirty #21
2012-06-05T16:12:29.746+02:00 n22 kernel: Call Trace:
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c102e632>]
warn_slowpath_common+0x72/0xa0
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c102e682>]
warn_slowpath_null+0x22/0x30
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c10c07aa>]
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c11240fc>] ?
splice_to_pipe+0xec/0x1f0
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c1124907>]
__generic_file_splice_read+0xe7/0x550
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c1062508>] ?
load_balance+0x88/0x5a0
2012-06-05T16:12:29.746+02:00 n22 kernel: [<c105e6b4>] ?
update_cfs_load+0x284/0x290
2012-06-05T16:12:29.746+02:00 n22 kernel: [<f8494db4>] ?
transfer_none+0x54/0x80 [loop]
2012-06-05T16:12:29.746+02:00 n22 kernel: [<f84945ba>] ?
lo_splice_actor+0xaa/0xe0 [loop]
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c1123430>] ?
page_cache_pipe_buf_release+0x20/0x20
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c1124e0c>]
generic_file_splice_read+0x9c/0x100
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c1124d70>] ?
__generic_file_splice_read+0x550/0x550
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c11238f5>]
do_splice_to+0x65/0x80
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c1123b65>]
splice_direct_to_actor+0xb5/0x1f0
2012-06-05T16:12:29.747+02:00 n22 kernel: [<f8494c40>] ?
loop_thread+0x510/0x510 [loop]
2012-06-05T16:12:29.747+02:00 n22 kernel: [<f84948b1>]
loop_thread+0x181/0x510 [loop]
2012-06-05T16:12:29.747+02:00 n22 kernel: [<f84945f0>] ?
lo_splice_actor+0xe0/0xe0 [loop]
2012-06-05T16:12:29.747+02:00 n22 kernel: [<f8494730>] ?
do_lo_send_write+0xe0/0xe0 [loop]
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c104cacc>] kthread+0x7c/0x90
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c104ca50>] ?
flush_kthread_worker+0xb0/0xb0
2012-06-05T16:12:29.747+02:00 n22 kernel: [<c1379cbe>]
kernel_thread_helper+0x6/0x10
2012-06-05T16:12:29.747+02:00 n22 kernel: ---[ end trace
00cc617101264b7f ]---
2012-06-05T16:12:29.784+02:00 n22 kernel: ------------[ cut here
]------------
2012-06-05T16:12:29.785+02:00 n22 kernel: WARNING: at mm/filemap.c:940
find_get_pages_contig+0x16a/0x1a0()
2012-06-05T16:12:29.785+02:00 n22 kernel: Hardware name: 6474B84
2012-06-05T16:12:29.785+02:00 n22 kernel: Modules linked in: loop fuse
dm_mod usblp hid_generic hid_cherry i915 usbhid hid fbcon font bitblit
softcursor drm_kms_helper drm fb fbdev intel_agp cfbcopyarea
i2c_algo_bit intel_gtt 8250_pci 8250 cfbimgblt i2c_i801 sr_mod
thinkpad_acpi hwmon video acpi_cpufreq nvram ac thermal sg evdev psmouse
serial_core i2c_core cdrom agpgart button battery mperf processor wmi
cfbfillrect [last unloaded: microcode]
2012-06-05T16:12:29.785+02:00 n22 kernel: Pid: 2359, comm: loop0
Tainted: G W 3.5.0-rc1-00037-g99becf1-dirty #21
2012-06-05T16:12:29.785+02:00 n22 kernel: Call Trace:
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c102e632>]
warn_slowpath_common+0x72/0xa0
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c102e682>]
warn_slowpath_null+0x22/0x30
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c10c07aa>]
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c11240fc>] ?
splice_to_pipe+0xec/0x1f0
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c1124907>]
__generic_file_splice_read+0xe7/0x550
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c10610d5>] ?
enqueue_entity+0xd5/0x210
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c10599e5>] ?
check_preempt_curr+0x65/0x90
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c1059a40>] ?
ttwu_do_wakeup+0x30/0x110
2012-06-05T16:12:29.785+02:00 n22 kernel: [<f8494db4>] ?
transfer_none+0x54/0x80 [loop]
2012-06-05T16:12:29.785+02:00 n22 kernel: [<f84945ba>] ?
lo_splice_actor+0xaa/0xe0 [loop]
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c1123430>] ?
page_cache_pipe_buf_release+0x20/0x20
2012-06-05T16:12:29.785+02:00 n22 kernel: [<c1124e0c>]
generic_file_splice_read+0x9c/0x100
2012-06-05T16:12:29.786+02:00 n22 kernel: [<c1124d70>] ?
__generic_file_splice_read+0x550/0x550
2012-06-05T16:12:29.786+02:00 n22 kernel: [<c11238f5>]
do_splice_to+0x65/0x80
2012-06-05T16:12:29.786+02:00 n22 kernel: [<c1123b65>]
splice_direct_to_actor+0xb5/0x1f0
2012-06-05T16:12:29.786+02:00 n22 kernel: [<f8494c40>] ?
loop_thread+0x510/0x510 [loop]
2012-06-05T16:12:29.786+02:00 n22 kernel: [<f84948b1>]
loop_thread+0x181/0x510 [loop]
2012-06-05T16:12:29.786+02:00 n22 kernel: [<f84945f0>] ?
lo_splice_actor+0xe0/0xe0 [loop]
2012-06-05T16:12:29.786+02:00 n22 kernel: [<f8494730>] ?
do_lo_send_write+0xe0/0xe0 [loop]
2012-06-05T16:12:29.786+02:00 n22 kernel: [<c104cacc>] kthread+0x7c/0x90
2012-06-05T16:12:29.786+02:00 n22 kernel: [<c104ca50>] ?
flush_kthread_worker+0xb0/0xb0
2012-06-05T16:12:29.786+02:00 n22 kernel: [<c1379cbe>]
kernel_thread_helper+0x6/0x10
2012-06-05T16:12:29.786+02:00 n22 kernel: ---[ end trace
00cc617101264b80 ]---
2012-06-05T16:12:29.801+02:00 n22 kernel: ------------[ cut here
]------------
2012-06-05T16:12:29.802+02:00 n22 kernel: WARNING: at mm/filemap.c:940
find_get_pages_contig+0x16a/0x1a0()
2012-06-05T16:12:29.802+02:00 n22 kernel: Hardware name: 6474B84
2012-06-05T16:12:29.802+02:00 n22 kernel: Modules linked in: loop fuse
dm_mod usblp hid_generic hid_cherry i915 usbhid hid fbcon font bitblit
softcursor drm_kms_helper drm fb fbdev intel_agp cfbcopyarea
i2c_algo_bit intel_gtt 8250_pci 8250 cfbimgblt i2c_i801 sr_mod
thinkpad_acpi hwmon video acpi_cpufreq nvram ac thermal sg evdev psmouse
serial_core i2c_core cdrom agpgart button battery mperf processor wmi
cfbfillrect [last unloaded: microcode]
2012-06-05T16:12:29.802+02:00 n22 kernel: Pid: 2359, comm: loop0
Tainted: G W 3.5.0-rc1-00037-g99becf1-dirty #21
2012-06-05T16:12:29.802+02:00 n22 kernel: Call Trace:
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c102e632>]
warn_slowpath_common+0x72/0xa0
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c102e682>]
warn_slowpath_null+0x22/0x30
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c10c07aa>]
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c11240fc>] ?
splice_to_pipe+0xec/0x1f0
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c1124907>]
__generic_file_splice_read+0xe7/0x550
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c1062508>] ?
load_balance+0x88/0x5a0
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c105e6b4>] ?
update_cfs_load+0x284/0x290
2012-06-05T16:12:29.802+02:00 n22 kernel: [<f8494db4>] ?
transfer_none+0x54/0x80 [loop]
2012-06-05T16:12:29.802+02:00 n22 kernel: [<f84945ba>] ?
lo_splice_actor+0xaa/0xe0 [loop]
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c1123430>] ?
page_cache_pipe_buf_release+0x20/0x20
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c1124e0c>]
generic_file_splice_read+0x9c/0x100
2012-06-05T16:12:29.802+02:00 n22 kernel: [<c1124d70>] ?
__generic_file_splice_read+0x550/0x550
2012-06-05T16:12:29.803+02:00 n22 kernel: [<c11238f5>]
do_splice_to+0x65/0x80
2012-06-05T16:12:29.803+02:00 n22 kernel: [<c1123b65>]
splice_direct_to_actor+0xb5/0x1f0
2012-06-05T16:12:29.803+02:00 n22 kernel: [<f8494c40>] ?
loop_thread+0x510/0x510 [loop]
2012-06-05T16:12:29.803+02:00 n22 kernel: [<f84948b1>]
loop_thread+0x181/0x510 [loop]
2012-06-05T16:12:29.803+02:00 n22 kernel: [<f84945f0>] ?
lo_splice_actor+0xe0/0xe0 [loop]
2012-06-05T16:12:29.803+02:00 n22 kernel: [<f8494730>] ?
do_lo_send_write+0xe0/0xe0 [loop]
2012-06-05T16:12:29.803+02:00 n22 kernel: [<c104cacc>] kthread+0x7c/0x90
2012-06-05T16:12:29.803+02:00 n22 kernel: [<c104ca50>] ?
flush_kthread_worker+0xb0/0xb0
2012-06-05T16:12:29.803+02:00 n22 kernel: [<c1379cbe>]
kernel_thread_helper+0x6/0x10
2012-06-05T16:12:29.803+02:00 n22 kernel: ---[ end trace
00cc617101264b81 ]---
2012-06-05T16:12:30.038+02:00 n22 kernel: ------------[ cut here
]------------
2012-06-05T16:12:30.039+02:00 n22 kernel: WARNING: at mm/filemap.c:940
find_get_pages_contig+0x16a/0x1a0()
2012-06-05T16:12:30.039+02:00 n22 kernel: Hardware name: 6474B84
2012-06-05T16:12:30.039+02:00 n22 kernel: Modules linked in: loop fuse
dm_mod usblp hid_generic hid_cherry i915 usbhid hid fbcon font bitblit
softcursor drm_kms_helper drm fb fbdev intel_agp cfbcopyarea
i2c_algo_bit intel_gtt 8250_pci 8250 cfbimgblt i2c_i801 sr_mod
thinkpad_acpi hwmon video acpi_cpufreq nvram ac thermal sg evdev psmouse
serial_core i2c_core cdrom agpgart button battery mperf processor wmi
cfbfillrect [last unloaded: microcode]
2012-06-05T16:12:30.039+02:00 n22 kernel: Pid: 2359, comm: loop0
Tainted: G W 3.5.0-rc1-00037-g99becf1-dirty #21
2012-06-05T16:12:30.039+02:00 n22 kernel: Call Trace:
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c102e632>]
warn_slowpath_common+0x72/0xa0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c10c07aa>] ?
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c102e682>]
warn_slowpath_null+0x22/0x30
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c10c07aa>]
find_get_pages_contig+0x16a/0x1a0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c11240fc>] ?
splice_to_pipe+0xec/0x1f0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c1124907>]
__generic_file_splice_read+0xe7/0x550
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c106ec6f>] ? ktime_get+0x5f/0xe0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c105182e>] ?
hrtimer_interrupt+0x16e/0x2a0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c1035e5c>] ? irq_exit+0x5c/0xa0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c1379e49>] ?
smp_apic_timer_interrupt+0x59/0x88
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c1062508>] ?
load_balance+0x88/0x5a0
2012-06-05T16:12:30.039+02:00 n22 kernel: [<c13792cd>] ?
apic_timer_interrupt+0x31/0x38
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c120d97f>] ? memcpy+0x1f/0x40
2012-06-05T16:12:30.040+02:00 n22 kernel: [<f8494db4>] ?
transfer_none+0x54/0x80 [loop]
2012-06-05T16:12:30.040+02:00 n22 kernel: [<f84945ba>] ?
lo_splice_actor+0xaa/0xe0 [loop]
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c1123430>] ?
page_cache_pipe_buf_release+0x20/0x20
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c1124e0c>]
generic_file_splice_read+0x9c/0x100
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c1124d70>] ?
__generic_file_splice_read+0x550/0x550
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c11238f5>]
do_splice_to+0x65/0x80
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c1123b65>]
splice_direct_to_actor+0xb5/0x1f0
2012-06-05T16:12:30.040+02:00 n22 kernel: [<f8494c40>] ?
loop_thread+0x510/0x510 [loop]
2012-06-05T16:12:30.040+02:00 n22 kernel: [<f84948b1>]
loop_thread+0x181/0x510 [loop]
2012-06-05T16:12:30.040+02:00 n22 kernel: [<f84945f0>] ?
lo_splice_actor+0xe0/0xe0 [loop]
2012-06-05T16:12:30.040+02:00 n22 kernel: [<f8494730>] ?
do_lo_send_write+0xe0/0xe0 [loop]
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c104cacc>] kthread+0x7c/0x90
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c104ca50>] ?
flush_kthread_worker+0xb0/0xb0
2012-06-05T16:12:30.040+02:00 n22 kernel: [<c1379cbe>]
kernel_thread_helper+0x6/0x10
2012-06-05T16:12:30.040+02:00 n22 kernel: ---[ end trace
00cc617101264b82 ]---
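
When a log contains many such traces, it can help to count how often each warning site fires. A minimal sketch, assuming the log text keeps the "WARNING: at file:line" form seen above. `count_warning_sites` and `WARN_RE` are illustrative names, not an existing tool:

```python
import re
from collections import Counter

# Matches the "WARNING: at <file>:<line>" prefix used in the traces above.
WARN_RE = re.compile(r"WARNING: at (\S+):(\d+)")


def count_warning_sites(log_text):
    """Return a Counter mapping 'file:line' to the number of warnings seen."""
    return Counter(
        f"{m.group(1)}:{m.group(2)}" for m in WARN_RE.finditer(log_text)
    )
```

Because the pattern only looks at the file and line number, it still works when the mail client has wrapped the function name onto the next line.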


--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

2012-06-05 14:21:40

by Ondrej Zary

Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

On Tuesday 05 June 2012, Konstantin Khlebnikov wrote:
> Ondrej Zary wrote:
> > Hello,
> > I use NFS for deploying HDD images on new machines. My machine has 2nd
> > network card just for this, running DHCPD, TFTPD and kernel NFS server.
> > The target machine is set to boot from LAN and boots SystemRescueCD from
> > my machine with an autorun script that launches Partimage and deploys the
> > HDD image (400 to 900 MB compressed).
> >
> > It worked fine for years, until now. With kernel 3.4, everything
> > works only for the first time after boot (and not always). Next time
> > (next machine), partimage aborts almost immediately as it's probably
> > unable to decompress the image file. md5sum is different on my machine
> > vs. on the target (through NFS). Also SystemRescueCD boot aborts with md5
> > error sometimes. Everything works fine after rebooting back to 3.3.
> >
> > Bisection found this:
> >
> > 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
> > commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
> > Author: Konstantin Khlebnikov<[email protected]>
> > Date: Wed Mar 28 14:42:54 2012 -0700
> >
> > radix-tree: use iterators in find_get_pages* functions
> >
> > Reverting this commit in 3.4 fixes the problem.
>
> [all reporters added to CC] let's keep all in one thread
>
> In attachment two patches which might help to debug this regression:
>
> "mm: recheck page index in find_get_pages_contig" adds paranoid check into
> find_get_pages_contig(). It can explain everything, but currently I don't
> see how this can happen.
>
> "mm: debug fing_get_pages speculative restart" shows lookup restarting
> condition which was removed by bisected commit.

My dmesg (after corruption occurred) with these two patches applied:

[ 79.999511] ------------[ cut here ]------------
[ 79.999564] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 79.999611] Hardware name: VT82C694X
[ 79.999617] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 79.999653] Pid: 1563, comm: nfsd Not tainted 3.4.0-omega #4
[ 79.999659] Call Trace:
[ 79.999729] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 79.999744] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 79.999753] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 79.999763] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 79.999772] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 79.999805] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 79.999853] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 79.999873] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 79.999900] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
[ 79.999931] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 79.999981] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.000000] [<c0150215>] ? getboottime+0x35/0x40
[ 80.007383] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.007505] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.007574] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.007590] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.007599] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.007608] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.007618] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.007654] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.007700] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.007715] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.007732] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.007745] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.007779] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.007825] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.007838] [<f892a000>] ? 0xf8929fff
[ 80.007846] [<f892a000>] ? 0xf8929fff
[ 80.007858] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.007867] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.007896] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.007937] ---[ end trace 0bc8170cf5ac5466 ]---
[ 80.007944] ------------[ cut here ]------------
[ 80.007966] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 80.007973] Hardware name: VT82C694X
[ 80.007977] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 80.014458] Pid: 1563, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 80.014547] Call Trace:
[ 80.014580] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 80.014603] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.014612] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.014624] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 80.014676] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.014697] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 80.014709] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 80.014727] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 80.014767] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
[ 80.014857] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 80.014886] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.014939] [<c0150215>] ? getboottime+0x35/0x40
[ 80.014950] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.015030] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.015080] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.015095] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.015104] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.015113] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.015162] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.015177] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.015190] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.015205] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.015221] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.015234] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.015305] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.015318] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.015330] [<f892a000>] ? 0xf8929fff
[ 80.015337] [<f892a000>] ? 0xf8929fff
[ 80.015350] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.015359] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.015415] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.015424] ---[ end trace 0bc8170cf5ac5467 ]---
[ 80.015430] ------------[ cut here ]------------
[ 80.015446] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 80.015452] Hardware name: VT82C694X
[ 80.015456] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 80.015534] Pid: 1563, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 80.015540] Call Trace:
[ 80.015560] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 80.015570] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.015579] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.015588] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 80.015597] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.015625] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 80.015672] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 80.015683] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 80.015697] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
[ 80.015712] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 80.015726] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.015782] [<c0150215>] ? getboottime+0x35/0x40
[ 80.015792] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.015829] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.015843] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.015867] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.015911] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.015921] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.015930] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.015945] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.015958] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.015972] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.016083] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.016096] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.016168] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.016182] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.016191] [<f892a000>] ? 0xf8929fff
[ 80.016198] [<f892a000>] ? 0xf8929fff
[ 80.016209] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.016218] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.016278] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.016285] ---[ end trace 0bc8170cf5ac5468 ]---
[ 80.016291] ------------[ cut here ]------------
[ 80.016309] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 80.016315] Hardware name: VT82C694X
[ 80.016319] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 80.016397] Pid: 1563, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 80.016403] Call Trace:
[ 80.016420] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 80.016430] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.016439] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.016449] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 80.016458] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.016483] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 80.016529] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 80.016541] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 80.016554] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
[ 80.016570] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 80.016584] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.016645] [<c0150215>] ? getboottime+0x35/0x40
[ 80.016656] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.016688] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.016701] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.016712] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.016769] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.016779] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.016788] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.016802] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.016816] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.016829] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.016851] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.016890] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.016915] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.016927] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.016936] [<f892a000>] ? 0xf8929fff
[ 80.016943] [<f892a000>] ? 0xf8929fff
[ 80.016951] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.016963] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.017018] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.017024] ---[ end trace 0bc8170cf5ac5469 ]---
[ 80.017030] ------------[ cut here ]------------
[ 80.017046] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 80.017052] Hardware name: VT82C694X
[ 80.017056] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 80.017136] Pid: 1563, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 80.017142] Call Trace:
[ 80.017158] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 80.017169] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.017178] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.017188] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 80.017197] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.017222] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 80.017285] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 80.017301] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
[ 80.017317] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 80.017331] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.017342] [<c0150215>] ? getboottime+0x35/0x40
[ 80.017352] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.017391] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.017445] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.017456] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.017465] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.017474] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.017483] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.017538] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.017551] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.017659] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.017675] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.017687] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.017713] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.017725] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.017782] [<f892a000>] ? 0xf8929fff
[ 80.017790] [<f892a000>] ? 0xf8929fff
[ 80.017799] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.017808] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.017821] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.017829] ---[ end trace 0bc8170cf5ac546a ]---
[ 80.017834] ------------[ cut here ]------------
[ 80.017851] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 80.017900] Hardware name: VT82C694X
[ 80.017905] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 80.017936] Pid: 1563, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 80.017941] Call Trace:
[ 80.017960] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 80.017970] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.018029] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.018040] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 80.018049] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.018062] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 80.018073] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 80.018087] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
[ 80.018114] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 80.018165] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.018177] [<c0150215>] ? getboottime+0x35/0x40
[ 80.018189] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.018227] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.018283] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.018296] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.018305] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.018314] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.018323] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.018339] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.018397] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.018412] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.018426] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.018438] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.018464] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.018522] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.018532] [<f892a000>] ? 0xf8929fff
[ 80.018539] [<f892a000>] ? 0xf8929fff
[ 80.018549] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.018558] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.018572] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.018578] ---[ end trace 0bc8170cf5ac546b ]---
[ 80.018584] ------------[ cut here ]------------
[ 80.018609] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 80.018648] Hardware name: VT82C694X
[ 80.018654] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 80.018684] Pid: 1563, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 80.018690] Call Trace:
[ 80.018705] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 80.018725] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.018771] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.018782] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 80.018791] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 80.018805] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 80.018816] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 80.018830] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
[ 80.018891] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 80.018903] [<c0150215>] ? getboottime+0x35/0x40
[ 80.018913] [<c04f0da8>] ? __schedule+0x198/0x470
[ 80.018947] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
[ 80.018959] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 80.019015] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 80.019025] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 80.019034] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 80.019044] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 80.019058] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 80.019071] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 80.019084] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 80.019139] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 80.019152] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 80.019177] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 80.019189] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 80.019198] [<f892a000>] ? 0xf8929fff
[ 80.019205] [<f892a000>] ? 0xf8929fff
[ 80.019257] [<c01389bc>] ? kthread+0x6c/0x80
[ 80.019267] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 80.019279] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 80.019286] ---[ end trace 0bc8170cf5ac546c ]---
[ 91.774922] ------------[ cut here ]------------
[ 91.775125] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.775327] Hardware name: VT82C694X
[ 91.775334] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.775483] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.775570] Call Trace:
[ 91.775642] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.775816] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.775829] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.775842] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.796462] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.796487] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.796554] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.796565] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.796580] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.796588] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.796608] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.796637] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.796706] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.796722] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.796731] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.796755] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.796855] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.796925] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.796941] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.796951] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.796960] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.796969] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.796986] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.797042] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.797057] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.797074] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.797086] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.797119] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.797195] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.797207] [<f892a000>] ? 0xf8929fff
[ 91.797215] [<f892a000>] ? 0xf8929fff
[ 91.797227] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.797236] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.797250] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.797259] ---[ end trace 0bc8170cf5ac546d ]---
[ 91.797311] ------------[ cut here ]------------
[ 91.797330] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.797337] Hardware name: VT82C694X
[ 91.797341] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.797374] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.797379] Call Trace:
[ 91.797412] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.797495] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.797505] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.797565] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.797576] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.797590] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.797604] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.797614] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.797628] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.797686] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.797697] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.797711] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.797727] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.797737] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.797746] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.797761] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.797841] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.797856] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.797869] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.797879] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.797931] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.797941] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.797957] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.797971] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.797985] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.798000] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.798052] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.798078] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.798090] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.798100] [<f892a000>] ? 0xf8929fff
[ 91.798107] [<f892a000>] ? 0xf8929fff
[ 91.798118] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.798176] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.798188] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.798195] ---[ end trace 0bc8170cf5ac546e ]---
[ 91.798201] ------------[ cut here ]------------
[ 91.798216] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.798222] Hardware name: VT82C694X
[ 91.798226] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.798306] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.798312] Call Trace:
[ 91.798328] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.798338] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.798347] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.798357] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.798366] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.798426] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.798437] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.798447] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.798459] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.798467] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.798477] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.798491] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.798551] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.798561] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.798571] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.798579] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.798617] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.798673] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.798687] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.798697] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.798706] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.798716] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.798731] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.798745] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.798794] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.798811] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.798823] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.798847] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.798859] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.798912] [<f892a000>] ? 0xf8929fff
[ 91.798920] [<f892a000>] ? 0xf8929fff
[ 91.798929] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.798938] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.798950] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.798957] ---[ end trace 0bc8170cf5ac546f ]---
[ 91.798963] ------------[ cut here ]------------
[ 91.798976] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.798982] Hardware name: VT82C694X
[ 91.799028] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.799060] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.799066] Call Trace:
[ 91.799080] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.799091] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.799100] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.799110] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.799161] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.799175] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.799186] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.799195] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.799206] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.799214] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.799225] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.799288] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.799304] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.799314] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.799323] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.799331] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.799377] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.799426] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.799438] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.799448] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.799457] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.799466] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.799484] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.799527] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.799542] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.799557] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.799569] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.799592] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.799654] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.799664] [<f892a000>] ? 0xf8929fff
[ 91.799671] [<f892a000>] ? 0xf8929fff
[ 91.799681] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.799690] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.799703] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.799710] ---[ end trace 0bc8170cf5ac5470 ]---
[ 91.799715] ------------[ cut here ]------------
[ 91.799743] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.799783] Hardware name: VT82C694X
[ 91.799789] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.799820] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.799825] Call Trace:
[ 91.799840] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.799925] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.799935] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.799946] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.799956] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.799968] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.799979] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.799989] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.800000] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.820316] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.820387] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.820415] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.820484] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.820500] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.820516] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.820537] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.820673] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.820728] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.820750] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.820760] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.820769] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.820823] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.820840] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.820854] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.820868] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.820886] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.820901] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.820964] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.820977] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.820990] [<f892a000>] ? 0xf8929fff
[ 91.820997] [<f892a000>] ? 0xf8929fff
[ 91.821009] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.821019] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.821075] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.821084] ---[ end trace 0bc8170cf5ac5471 ]---
[ 91.821091] ------------[ cut here ]------------
[ 91.821112] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.821119] Hardware name: VT82C694X
[ 91.821123] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.821205] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.821211] Call Trace:
[ 91.821233] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.821244] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.821253] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.821262] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.821318] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.821332] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.821343] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.821353] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.821365] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.821373] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.821383] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.821438] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.821454] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.821463] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.821472] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.821481] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.821529] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.821599] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.821616] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.821627] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.821636] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.821717] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.821734] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.821748] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.821762] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.821881] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.821896] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.821938] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.822017] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.825357] [<f892a000>] ? 0xf8929fff
[ 91.825368] [<f892a000>] ? 0xf8929fff
[ 91.825383] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.825402] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.825457] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.825466] ---[ end trace 0bc8170cf5ac5472 ]---
[ 91.825473] ------------[ cut here ]------------
[ 91.825496] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
[ 91.825502] Hardware name: VT82C694X
[ 91.825506] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
firewire_ohci firewire_core
[ 91.825585] Pid: 1564, comm: nfsd Tainted: G W 3.4.0-omega #4
[ 91.825590] Call Trace:
[ 91.825612] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
[ 91.825622] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.825632] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.825689] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
[ 91.825699] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
[ 91.825719] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
[ 91.825734] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.825744] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
[ 91.825759] [<c01b7cec>] ? iput+0x2c/0x210
[ 91.825812] [<c01b5a5f>] ? d_obtain_alias+0x2f/0x170
[ 91.825833] [<c0203f89>] ? ext3_fh_to_dentry+0x19/0x20
[ 91.825850] [<c022c9b7>] ? exportfs_decode_fh+0x87/0x250
[ 91.825881] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
[ 91.825939] [<c04c35a6>] ? udp_recvmsg+0x186/0x2d0
[ 91.825949] [<c01254bf>] ? irq_exit+0x4f/0x90
[ 91.825960] [<c0103b03>] ? do_IRQ+0x43/0xa0
[ 91.826039] [<f88cdc71>] ? cache_check+0x71/0x3f0 [sunrpc]
[ 91.826102] [<c04f2589>] ? common_interrupt+0x29/0x30
[ 91.826117] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
[ 91.826127] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
[ 91.826179] [<c01c4330>] ? do_splice_to+0x60/0x90
[ 91.826189] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
[ 91.826206] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
[ 91.826219] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
[ 91.826233] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
[ 91.826249] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
[ 91.826305] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
[ 91.826332] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
[ 91.826344] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
[ 91.826353] [<f892a000>] ? 0xf8929fff
[ 91.826360] [<f892a000>] ? 0xf8929fff
[ 91.826370] [<c01389bc>] ? kthread+0x6c/0x80
[ 91.826427] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
[ 91.826439] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
[ 91.826446] ---[ end trace 0bc8170cf5ac5473 ]---


--
Ondrej Zary

2012-06-05 14:52:14

by Konstantin Khlebnikov

Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Hmm, very interesting!
Please try this patch; it should fix the problem and print some numbers for debugging.

Ondrej Zary wrote:
> On Tuesday 05 June 2012, Konstantin Khlebnikov wrote:
>> Ondrej Zary wrote:
>>> Hello,
>>> I use NFS for deploying HDD images on new machines. My machine has 2nd
>>> network card just for this, running DHCPD, TFTPD and kernel NFS server.
>>> The target machine is set to boot from LAN and boots SystemRescueCD from
>>> my machine with an autorun script that launches Partimage and deploys the
>>> HDD image (400 to 900 MB compressed).
>>>
>>> It worked fine for years, until now. With kernel 3.4, everything
>>> works only for the first time after boot (and not always). Next time
>>> (next machine), partimage aborts almost immediately as it's probably
>>> unable to decompress the image file. md5sum is different on my machine
>>> vs. on the target (through NFS). Also SystemRescueCD boot aborts with md5
>>> error sometimes. Everything works fine after rebooting back to 3.3.
>>>
>>> Bisection found this:
>>>
>>> 0fc9d1040313047edf6a39fd4d7c7defdca97c62 is the first bad commit
>>> commit 0fc9d1040313047edf6a39fd4d7c7defdca97c62
>>> Author: Konstantin Khlebnikov<[email protected]>
>>> Date: Wed Mar 28 14:42:54 2012 -0700
>>>
>>> radix-tree: use iterators in find_get_pages* functions
>>>
>>> Reverting this commit in 3.4 fixes the problem.
>>
>> [all reporters added to CC] let's keep all in one thread
>>
>> Attached are two patches which might help to debug this regression:
>>
>> "mm: recheck page index in find_get_pages_contig" adds a paranoid check to
>> find_get_pages_contig(). It could explain everything, but currently I don't
>> see how this can happen.
>>
>> "mm: debug fing_get_pages speculative restart" shows the lookup-restart
>> condition which was removed by the bisected commit.
>
> My dmesg (after corruption occurred) with these two patches applied:
>
> [ 79.999511] ------------[ cut here ]------------
> [ 79.999564] WARNING: at mm/filemap.c:941 find_get_pages_contig+0x177/0x1b0()
> [ 79.999611] Hardware name: VT82C694X
> [ 79.999617] Modules linked in: nfsd lockd sunrpc des_generic ecb crypto_blkcipher md4 md5 hmac cryptomgr aead cifs crypto_hash crypto_algapi crypto
> firewire_ohci firewire_core
> [ 79.999653] Pid: 1563, comm: nfsd Not tainted 3.4.0-omega #4
> [ 79.999659] Call Trace:
> [ 79.999729] [<c011ff88>] ? warn_slowpath_common+0x78/0xb0
> [ 79.999744] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
> [ 79.999753] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
> [ 79.999763] [<c011ffd9>] ? warn_slowpath_null+0x19/0x20
> [ 79.999772] [<c0175187>] ? find_get_pages_contig+0x177/0x1b0
> [ 79.999805] [<c01c544b>] ? __generic_file_splice_read+0xeb/0x510
> [ 79.999853] [<c01c4040>] ? page_cache_pipe_buf_release+0x10/0x10
> [ 79.999873] [<c04f2589>] ? common_interrupt+0x29/0x30
> [ 79.999900] [<f892c710>] ? _fh_update.isra.11.part.12+0x60/0x60 [nfsd]
> [ 79.999931] [<c022c9f7>] ? exportfs_decode_fh+0xc7/0x250
> [ 79.999981] [<f893133d>] ? exp_get_by_name+0x3d/0x70 [nfsd]
> [ 80.000000] [<c0150215>] ? getboottime+0x35/0x40
> [ 80.007383] [<c04f0da8>] ? __schedule+0x198/0x470
> [ 80.007505] [<f88cbf34>] ? sunrpc_cache_lookup+0x54/0x2d0 [sunrpc]
> [ 80.007574] [<c01c58e3>] ? generic_file_splice_read+0x73/0x110
> [ 80.007590] [<c01254bf>] ? irq_exit+0x4f/0x90
> [ 80.007599] [<c01c5870>] ? __generic_file_splice_read+0x510/0x510
> [ 80.007608] [<c01c4330>] ? do_splice_to+0x60/0x90
> [ 80.007618] [<c01c459a>] ? splice_direct_to_actor+0xaa/0x1c0
> [ 80.007654] [<f892d710>] ? nfsd_buffered_filldir+0x160/0x160 [nfsd]
> [ 80.007700] [<f892dc37>] ? nfsd_vfs_read.isra.16+0x117/0x160 [nfsd]
> [ 80.007715] [<f892e764>] ? nfsd_read+0x1c4/0x280 [nfsd]
> [ 80.007732] [<f89357bf>] ? nfsd3_proc_read+0xcf/0x160 [nfsd]
> [ 80.007745] [<f892a7d0>] ? nfsd_dispatch+0xb0/0x190 [nfsd]
> [ 80.007779] [<f88c3682>] ? svc_process+0x442/0x7c0 [sunrpc]
> [ 80.007825] [<f892a0a3>] ? nfsd+0xa3/0x130 [nfsd]
> [ 80.007838] [<f892a000>] ? 0xf8929fff
> [ 80.007846] [<f892a000>] ? 0xf8929fff
> [ 80.007858] [<c01389bc>] ? kthread+0x6c/0x80
> [ 80.007867] [<c0138950>] ? kthread_freezable_should_stop+0x50/0x50
> [ 80.007896] [<c04f2596>] ? kernel_thread_helper+0x6/0xd
> [ 80.007937] ---[ end trace 0bc8170cf5ac5466 ]---


Attachments:
mm-fix-find_get_pages_contig (761.00 B)

2012-06-05 15:07:46

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Konstantin Khlebnikov <[email protected]> writes:

> Hmm, very interesting!
> Please try this patch, it must fix the problem and print some numbers to debug.
>

I think the bug is in radix_tree_for_each_contig().

radix_tree_next_slot() returns NULL if the next slot is NULL (i.e. there is
a hole). But slot == NULL does not stop the iteration here. Instead, when
slot is NULL, the loop goes on and fetches the next chunk.

Bang.
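
To illustrate the failure mode described above, here is a minimal user-space sketch. It is not the kernel code: the toy_* names, CHUNK, NSLOTS and the "fixed" flag are all assumptions made for illustration, and the array of booleans merely stands in for populated and empty radix-tree slots. With fixed == false the walk jumps over a hole to the next chunk; with fixed == true a hole ends the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CHUNK  4   /* slots per chunk; hypothetical stand-in for one tree node */
#define NSLOTS 12  /* size of the toy index space */

struct toy_iter {
    size_t index;      /* index of the slot last returned */
    size_t next_index; /* where the next chunk lookup will start */
};

/* Start a new chunk at it->next_index; fails if that slot is a hole. */
static bool toy_next_chunk(const bool *present, struct toy_iter *it)
{
    size_t idx = it->next_index;

    if (idx >= NSLOTS || !present[idx])
        return false;
    it->index = idx;
    it->next_index = (idx / CHUNK + 1) * CHUNK;
    return true;
}

/* Advance by one slot inside the current chunk. */
static bool toy_next_slot(const bool *present, struct toy_iter *it, bool fixed)
{
    size_t next = it->index + 1;

    if (next >= it->next_index)
        return false;            /* chunk exhausted: the next chunk is adjacent */
    if (present[next]) {
        it->index = next;
        return true;
    }
    if (fixed)
        it->next_index = NSLOTS; /* hole: forbid moving on to the next chunk */
    return false;                /* buggy variant leaves next_index untouched */
}

/* Record the indices visited by a "contiguous" walk starting at start. */
static size_t toy_walk_contig(const bool *present, size_t start,
                              size_t *out, size_t max, bool fixed)
{
    struct toy_iter it = { .index = start, .next_index = start };
    size_t n = 0;
    bool have;

    assert(present != NULL && out != NULL);
    have = toy_next_chunk(present, &it);
    while (have && n < max) {
        out[n++] = it.index;
        /* Same shape as the macro: try the next slot, else the next chunk. */
        have = toy_next_slot(present, &it, fixed) ||
               toy_next_chunk(present, &it);
    }
    return n;
}
```

With slots 0, 1 and 3..7 populated and slot 2 empty, a walk from 0 in the buggy variant visits 0, 1, 4, 5, 6, 7, while the fixed variant stops after 0, 1.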
--
OGAWA Hirofumi <[email protected]>

2012-06-05 15:14:38

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

OGAWA Hirofumi wrote:
> Konstantin Khlebnikov<[email protected]> writes:
>
>> Hmm, very interesting!
>> Please try this patch, it must fix the problem and print some numbers to debug.
>>
>
> I think the bug is in radix_tree_for_each_contig().
>
> radix_tree_next_slot() returns NULL if the slot was NULL (i.e. there is
> hole). But, slot == NULL is not meaning to stop iterate here. Actually,
> if slot is NULL, it gets next chunk.
>
> Bang.

Yeah, you are right, I already found this too.
I'm currently thinking about how to fix this more accurately...

2012-06-05 15:59:21

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Proper fix in attachment.

Konstantin Khlebnikov wrote:
> OGAWA Hirofumi wrote:
>> Konstantin Khlebnikov<[email protected]> writes:
>>
>>> Hmm, very interesting!
>>> Please try this patch, it must fix the problem and print some numbers to debug.
>>>
>>
>> I think the bug is in radix_tree_for_each_contig().
>>
>> radix_tree_next_slot() returns NULL if the slot was NULL (i.e. there is
>> hole). But, slot == NULL is not meaning to stop iterate here. Actually,
>> if slot is NULL, it gets next chunk.
>>
>> Bang.
>
> Yeah, you are right, I already found this too.
> Currently I think how to fix this more accurately...


Attachments:
radix-tree-fix-contiguous-iterator (1.82 kB)

2012-06-05 16:18:57

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Konstantin Khlebnikov <[email protected]> writes:

> Proper fix in attachment.

Are you going to add the stable tag, so it goes into the next stable release?

Thanks.

> radix-tree: fix contiguous iterator
>
> From: Konstantin Khlebnikov <[email protected]>
>
> This patch fixes a bug in the macro radix_tree_for_each_contig().
> If radix_tree_next_slot() sees NULL in the next slot it returns NULL, but the
> following radix_tree_next_chunk() switches iteration to the next chunk. As a
> result, iteration becomes non-contiguous and breaks vfs "splice" and all its users.

--
OGAWA Hirofumi <[email protected]>

2012-06-05 16:39:20

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

OGAWA Hirofumi wrote:
> Konstantin Khlebnikov<[email protected]> writes:
>
>> Proper fix in attachment.
>
> Maybe, you are going to add the stable tag for next stable?
>
> Thanks.

Yes, this definitely must go into the next stable 3.4.x, but first I'll wait for confirmation.

Anyone who can reproduce this, please test the patch "radix-tree: fix contiguous iterator"
from my previous mail in this thread.

>
>> radix-tree: fix contiguous iterator
>>
>> From: Konstantin Khlebnikov<[email protected]>
>>
>> This patch fixes bug in macro radix_tree_for_each_contig().
>> If radix_tree_next_slot() sees NULL in next slot it returns NULL, but following
>> radix_tree_next_chunk() switches iterating into next chunk. As result iterating
>> becomes non-contiguous and breaks vfs "splice" and all its users.
>

2012-06-05 17:03:54

by Toralf Förster

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

On 06/05/2012 05:59 PM, Konstantin Khlebnikov wrote:
> Proper fix in attachment.
>
That fix solves the chroot issue I had. Tested on top of
v3.5-rc1-37-g99becf1

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3

2012-06-05 17:17:30

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Toralf Förster wrote:
> On 06/05/2012 05:59 PM, Konstantin Khlebnikov wrote:
>> Proper fix in attachment.
>>
> That fixes solves the chroot issue I had have - tested at top of
> v3.5-rc1-37-g99becf1
>

Ok, thanks.

2012-06-05 23:32:04

by Hans de Bruin

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

On 06/05/2012 06:39 PM, Konstantin Khlebnikov wrote:
> OGAWA Hirofumi wrote:
>> Konstantin Khlebnikov<[email protected]> writes:
>>
>>> Proper fix in attachment.
>>
>> Maybe, you are going to add the stable tag for next stable?
>>
>> Thanks.
>
> Yes, this definitely must be in next stable 3.4.x , but first I'll wait
> for confirmation.
>
> Guys, who can reproduce this, please check patch "radix-tree: fix
> contiguous iterator"
> from my previous mail in this thread.
>
>>
>>> radix-tree: fix contiguous iterator
>>>
>>> From: Konstantin Khlebnikov<[email protected]>
>>>
>>> This patch fixes bug in macro radix_tree_for_each_contig().
>>> If radix_tree_next_slot() sees NULL in next slot it returns NULL, but
>>> following
>>> radix_tree_next_chunk() switches iterating into next chunk. As result
>>> iterating
>>> becomes non-contiguous and breaks vfs "splice" and all its users.
>>
>

I applied the patch on top of v3.4, and Firefox and Thunderbird do not
segfault anymore. They do not start either. This was the behaviour on my
'production' server that I could not reproduce on my test server. Maybe it
has something to do with the different types of NICs.

For the second attempt I branched off at 0fc9d1040313047edf6a39fd and
applied your patch on top of it. Firefox and Thunderbird were back
again. So your patch works.

Now I need some git instructions. Apparently something else is broken. I
branched off with:

git branch debug 0fc9d1040313047edf6a3
git checkout debug
applied the patch.
git commit -a (got commit 5c09c685ba2d36c3b905220d43ad1b47354e456eed back)

Now I want all commits in my master branch up until ref v3.4 added to my
debug branch so I can bisect between 5c09 and v3.4

--
Hans

2012-06-06 08:56:57

by Ondrej Zary

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

On Tuesday 05 June 2012, Konstantin Khlebnikov wrote:
> Proper fix in attachment.

Applying this patch to clean 3.4 fixes the problem.

--
Ondrej Zary

2012-06-06 10:55:35

by Konstantin Khlebnikov

[permalink] [raw]
Subject: Re: [bisected commit 0fc9d10] NFS-server corruption with 3.4

Hans de Bruin wrote:
> On 06/05/2012 06:39 PM, Konstantin Khlebnikov wrote:
>> OGAWA Hirofumi wrote:
>>> Konstantin Khlebnikov<[email protected]> writes:
>>>
>>>> Proper fix in attachment.
>>>
>>> Maybe, you are going to add the stable tag for next stable?
>>>
>>> Thanks.
>>
>> Yes, this definitely must be in next stable 3.4.x , but first I'll wait
>> for confirmation.
>>
>> Guys, who can reproduce this, please check patch "radix-tree: fix
>> contiguous iterator"
>> from my previous mail in this thread.
>>
>>>
>>>> radix-tree: fix contiguous iterator
>>>>
>>>> From: Konstantin Khlebnikov<[email protected]>
>>>>
>>>> This patch fixes bug in macro radix_tree_for_each_contig().
>>>> If radix_tree_next_slot() sees NULL in next slot it returns NULL, but
>>>> following
>>>> radix_tree_next_chunk() switches iterating into next chunk. As result
>>>> iterating
>>>> becomes non-contiguous and breaks vfs "splice" and all its users.
>>>
>>
>
> I patched on to off v3.4 and Firefox an Thunderbird do not segfault
> anymore. The do not start either. This was de feature on my 'production'
> server I could not reproduce on my test server. Maybe it has something
> to with the different type of nic's.
>
> For the second attempt I branched of at 0fc9d1040313047edf6a39fd and
> applied your patch on top of it. Firefox an Thunderbird where back
> again. So your patch works.
>
> Now I need some git-instructions. Apparently something else is broken. I
> branched of with:
>
> git branch debug 0fc9d1040313047edf6a3
> git checkout debug
> applied the patch.
> git commit -a (got commit 5c09c685ba2d36c3b905220d43ad1b47354e456eed back)
>
> Now I want all commits in my master branch up until ref v3.4 added to my
> debug branch so I can bisect between 5c09 and v3.4
>

No, it does not work that way. You should apply the patch before each test.
Alternatively, you can fix the radix-tree bug in find_get_pages_contig(); this
avoids changing radix-tree.h, so the kernel rebuilds faster.

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -927,7 +927,7 @@ repeat:
 		 * otherwise we can get both false positives and false
 		 * negatives, which is just confusing to the caller.
 		 */
-		if (page->mapping == NULL || page->index != iter.index) {
+		if (page->mapping == NULL || page->index != index + ret) {
 			page_cache_release(page);
 			break;
 		}
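
The idea behind the one-line change above can be sketched in user space. This is an illustration under stated assumptions, not the kernel code: struct toy_page and toy_gather_contig() are hypothetical names, and the "found" array stands in for whatever pages the iterator hands back. The point is that each page is compared against the index the caller expects (start + ret) rather than against the iterator's own position, so a jump over a hole ends the gathering even if the iterator itself misbehaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the index matters here. */
struct toy_page {
    size_t index;
};

/*
 * Collect pages from found[] while they are contiguous starting at start.
 * Returns the number of pages stored in out[].
 */
static size_t toy_gather_contig(const struct toy_page *const *found,
                                size_t nfound, size_t start,
                                const struct toy_page **out, size_t max)
{
    size_t ret = 0;

    assert(found != NULL && out != NULL);
    for (size_t i = 0; i < nfound && ret < max; i++) {
        /* Expected index is start + ret, independent of the iterator. */
        if (found[i]->index != start + ret)
            break;
        out[ret++] = found[i];
    }
    return ret;
}
```

For pages with indices 0, 1, 4, 5 and start 0, only the first two are gathered, because the third page's index (4) does not match the expected index (2).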