2014-07-15 08:14:58

by Mike Qiu

Subject: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

My Power7 box fails to boot with commit:

254c4407cb84a6dec90336054615b0f0e996bb7c
bio: modify __bio_add_page() to accept pages that don't start a new segment

Reverting it works for me.

See below:

[ 22.659431] ------------[ cut here ]------------
[ 22.659437] kernel BUG at fs/direct-io.c:747!
[ 22.659501] Oops: Exception in kernel mode, sig: 5 [#1]
[ 22.659528] SMP NR_CPUS=1024 NUMA PowerNV
[ 22.659533] Modules linked in: e1000e vhost_net tun ses(+) macvtap macvlan enclosure ptp pps_core vhost be2net(+) shpchp kvm binfmt_misc uinput lpfc scsi_transport_fc ipr
[ 22.659688] CPU: 8 PID: 772 Comm: lvm Not tainted 3.16.0-rc5-next-20140714+ #76
[ 22.659755] task: c0000003b0a7dc20 ti: c0000003b0afc000 task.ti: c0000003b0afc000
[ 22.659823] NIP: c0000000002ba854 LR: c0000000002bad80 CTR: 0000000000000010
[ 22.659890] REGS: c0000003b0aff450 TRAP: 0700 Not tainted (3.16.0-rc5-next-20140714+)
[ 22.659957] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 24222844 XER: 20000000
[ 22.660114] CFAR: c0000000002bad90 SOFTE: 1
GPR00: c0000000002bad80 c0000003b0aff6d0 c00000000145c148 0000000000000000
GPR04: 0000000000000000 0000000000000000 c000000000b6e7c8 0000000000000001
GPR08: 0000000000000000 0000000000010000 0000000000100000 f000000000000000
GPR12: 0000000024222844 c00000000fee2400 0000000000000010 c0000003b9140000
GPR16: 0000000000010000 c0000003b9140000 0000000000047bff 0000000000010000
GPR20: 0000000000000000 f000000000cb0fdc 0000000000010000 0000000000010000
GPR24: 0000000000000000 0000000000010000 0000000000000000 c0000003b0afc000
GPR28: 0000000000000000 00000000023dff80 c0000003fcb10380 c0000003b9140028
[ 22.660980] NIP [c0000000002ba854] .__blockdev_direct_IO+0x1584/0x3960
[ 22.661036] LR [c0000000002bad80] .__blockdev_direct_IO+0x1ab0/0x3960
[ 22.661092] Call Trace:
[ 22.661116] [c0000003b0aff6d0] [c0000000002bad80] .__blockdev_direct_IO+0x1ab0/0x3960 (unreliable)
[ 22.661208] [c0000003b0aff980] [c0000000002b6114] .blkdev_direct_IO+0x64/0x80
[ 22.661276] [c0000003b0affa20] [c0000000001dd430] .generic_file_read_iter+0x5b0/0x690
[ 22.661355] [c0000003b0affb50] [c0000000002b5a40] .blkdev_read_iter+0x60/0x90
[ 22.661423] [c0000003b0affbd0] [c000000000269d28] .new_sync_read+0xa8/0x120
[ 22.661491] [c0000003b0affcf0] [c00000000026b280] .vfs_read+0xc0/0x1f0
[ 22.661559] [c0000003b0affd90] [c00000000026b674] .SyS_read+0x64/0x110
[ 22.661628] [c0000003b0affe30] [c00000000000a158] syscall_exit+0x0/0x98
[ 22.661695] Instruction dump:
[ 22.661729] e88100d8 80a100e4 80c100e0 f92100c0 39200000 912100a8 4814fe15 60000000
[ 22.661841] 812100e4 78630020 7f891800 419ef880 <0fe00000> 60000000 60420000 e9410118
[ 22.661955] ---[ end trace 6248a5bb36020fd2 ]---

Thanks,
Mike




2014-07-15 08:41:48

by Jens Axboe

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

On 15/07/2014, at 10.14, Mike Qiu <[email protected]> wrote:
>
> My Power7 box fails to boot with commit:
>
> 254c4407cb84a6dec90336054615b0f0e996bb7c
> bio: modify __bio_add_page() to accept pages that don't start a new segment
>
> Reverting it works for me.

I reverted it yesterday in my tree.

2014-07-15 08:43:02

by Mike Qiu

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

On 07/15/2014 04:41 PM, Jens Axboe wrote:
> On 15/07/2014, at 10.14, Mike Qiu <[email protected]> wrote:
>> My Power7 box fails to boot with commit:
>>
>> 254c4407cb84a6dec90336054615b0f0e996bb7c
>> bio: modify __bio_add_page() to accept pages that don't start a new segment
>>
>> Reverting it works for me.
> I reverted it yesterday in my tree.
>

OK, that's fine :)

Thanks
Mike

2014-07-15 08:44:55

by Maurizio Lombardi

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment



On 07/15/2014 10:41 AM, Jens Axboe wrote:
> On 15/07/2014, at 10.14, Mike Qiu <[email protected]> wrote:
>>
>> My Power7 box fails to boot with commit:
>>
>> 254c4407cb84a6dec90336054615b0f0e996bb7c
>> bio: modify __bio_add_page() to accept pages that don't start a new segment
>>
>> Reverting it works for me.
>
> I reverted it yesterday in my tree.
>


The problem was here:

        if (q->merge_bvec_fn) {
                struct bvec_merge_data bvm = {
                        .bi_bdev = bio->bi_bdev,
                        .bi_sector = bio->bi_iter.bi_sector,
                        .bi_size = bio->bi_iter.bi_size,   <-------
                        .bi_rw = bio->bi_rw,
                };

                /*
                 * merge_bvec_fn() returns number of bytes it can accept
                 * at this offset
                 */
                if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len)
                        goto failed;
        }

        /* If we may be able to merge these biovecs, force a recount */
        if (bio->bi_vcnt > 1 && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))
                bio->bi_flags &= ~(1 << BIO_SEG_VALID);


it should have been ".bi_size = bio->bi_iter.bi_size - len"
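
To illustrate why the subtraction matters, here is a minimal user-space sketch, not kernel code. It assumes (as the proposed fix implies, but the thread does not spell out) that the patched function had already added `len` to `bi_size` before calling the merge callback. The names `merge_data`, `toy_merge_fn`, `page_accepted`, and `LIMIT_BYTES` are hypothetical, and real `merge_bvec_fn` implementations apply their own device-specific limits.

```c
#include <assert.h>

/* Hypothetical stand-in for struct bvec_merge_data; only the field used here. */
struct merge_data {
        unsigned int bi_size;   /* bytes in the bio before the new page */
};

/* Assumed per-bio size limit of the toy device (not a real kernel value). */
#define LIMIT_BYTES 65536u

/* Toy merge callback: returns how many of `len` bytes fit after bi_size. */
static unsigned int toy_merge_fn(const struct merge_data *bvm, unsigned int len)
{
        unsigned int room =
                bvm->bi_size >= LIMIT_BYTES ? 0 : LIMIT_BYTES - bvm->bi_size;

        return room < len ? room : len;
}

/*
 * Returns 1 if the page is accepted, 0 otherwise.
 * `size_after_add` models a bi_size that already includes the new `len` bytes.
 * `subtract_len` selects between the reported behavior (0) and the fix (1).
 */
static int page_accepted(unsigned int size_after_add, unsigned int len,
                         int subtract_len)
{
        struct merge_data bvm;

        assert(size_after_add >= len);
        bvm.bi_size = subtract_len ? size_after_add - len : size_after_add;

        return toy_merge_fn(&bvm, len) >= len;
}
```

With the 64 KiB toy limit, a bio holding 60 KiB before a 4 KiB page is added is rejected when the already-incremented size is passed (`page_accepted(65536, 4096, 0)` is 0) and accepted once `len` is subtracted (`page_accepted(65536, 4096, 1)` is 1). In this model the unadjusted value double-counts the new page.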

Regards,
Maurizio Lombardi

2014-07-15 11:38:49

by Maurizio Lombardi

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment



On 07/15/2014 10:14 AM, Mike Qiu wrote:
> My Power7 box fails to boot with commit:
>
> 254c4407cb84a6dec90336054615b0f0e996bb7c
> bio: modify __bio_add_page() to accept pages that don't start a new segment
>
> Reverting it works for me.

This looks strange to me because, even after reverting my patch, the kernel still panics
with a different call trace:


[ 68.999477] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 69.007411] IP: [<ffffffff81322238>] blk_throtl_drain+0x28/0x110
[ 69.013510] PGD 222e5f067 PUD 21bdab067 PMD 0
[ 69.018051] Oops: 0000 [#1] SMP
[ 69.021335] Modules linked in: serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic dm_crypt loop iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi cfg80211 rfkill x86_pkg_temp_thermal coretemp kvm_intel bnx2x kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel e1000e tpm_infineon ptp iTCO_wdt pcspkr iTCO_vendor_support tpm_tis pps_core microcode ipmi_si serio_raw i2c_i801 ipmi_msghandler mdio tpm ie31200_edac video lpc_ich mfd_core shpchp edac_core xfs libcrc32c ast i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi
[ 69.070767] CPU: 3 PID: 11130 Comm: cryptsetup Not tainted 3.16.0-rc5-next-20140714+ #2
[ 69.078862] Hardware name: wortmann To be filled by O.E.M./P8B-M Series, BIOS 6103 12/06/2012
[ 69.087413] task: ffff8802225c0930 ti: ffff88021bda0000 task.ti: ffff88021bda0000
[ 69.094977] RIP: 0010:[<ffffffff81322238>] [<ffffffff81322238>] blk_throtl_drain+0x28/0x110
[ 69.103529] RSP: 0018:ffff88021bda3b60 EFLAGS: 00010046
[ 69.108866] RAX: 0000000000000000 RBX: ffff880222001638 RCX: 000000007fffffff
[ 69.116110] RDX: 000000000000000b RSI: 0000000000000000 RDI: 0000000000000000
[ 69.123337] RBP: ffff88021bda3b78 R08: 0000000000000000 R09: ffff8800c9011ff0
[ 69.130562] R10: ffff88021bda3b78 R11: ffffffff8132fd61 R12: ffff880222001638
[ 69.137764] R13: ffff88021c84e600 R14: ffff880222001c30 R15: 0000000000000000
[ 69.144981] FS: 00007f81fd8fb840(0000) GS:ffff88022fd80000(0000) knlGS:0000000000000000
[ 69.153144] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 69.158931] CR2: 0000000000000028 CR3: 000000021bdac000 CR4: 00000000000407e0
[ 69.166131] Stack:
[ 69.168160] ffff880222001638 0000000000000000 ffff880222001c40 ffff88021bda3b88
[ 69.175671] ffffffff8131f36e ffff88021bda3bb8 ffffffff8130294e ffff880222001638
[ 69.183182] ffffffff819c9580 ffff880222001638 ffff88021af91800 ffff88021bda3bd0
[ 69.190740] Call Trace:
[ 69.193218] [<ffffffff8131f36e>] blkcg_drain_queue+0xe/0x10
[ 69.198921] [<ffffffff8130294e>] __blk_drain_queue+0x6e/0x150
[ 69.204804] [<ffffffff81302a8d>] blk_queue_bypass_start+0x5d/0x80
[ 69.211050] [<ffffffff8131e668>] blkcg_deactivate_policy+0x38/0x120
[ 69.217437] [<ffffffff81322454>] blk_throtl_exit+0x34/0x50
[ 69.223094] [<ffffffff8131f3aa>] blkcg_exit_queue+0x3a/0x40
[ 69.228822] [<ffffffff81305ee6>] blk_release_queue+0x26/0xe0
[ 69.234600] [<ffffffff8132fd47>] kobject_cleanup+0x77/0x1b0
[ 69.240345] [<ffffffff8132fbf8>] kobject_put+0x28/0x60
[ 69.245596] [<ffffffff81302b9b>] blk_cleanup_queue+0xeb/0x120
[ 69.251464] [<ffffffff81546cb1>] __dm_destroy+0x1d1/0x250
[ 69.257043] [<ffffffff81547a53>] dm_destroy+0x13/0x20
[ 69.262265] [<ffffffff8154d1fe>] dev_remove+0x11e/0x180
[ 69.267614] [<ffffffff8154d0e0>] ? dev_suspend+0x250/0x250
[ 69.273269] [<ffffffff8154d8d6>] ctl_ioctl+0x246/0x4e0
[ 69.278549] [<ffffffff8154db83>] dm_ctl_ioctl+0x13/0x20
[ 69.283920] [<ffffffff811e08d8>] do_vfs_ioctl+0x2d8/0x4b0
[ 69.289458] [<ffffffff811e0b31>] SyS_ioctl+0x81/0xa0
[ 69.294544] [<ffffffff811166b6>] ? __audit_syscall_exit+0x236/0x2e0
[ 69.300964] [<ffffffff816d5369>] system_call_fastpath+0x16/0x1b
[ 69.307020] Code: 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 41 54 49 89 fc 53 4c 8b af e0 06 00 00 49 8b 85 88 00 00 00 31 ff 48 8b 80 68 05 00 00 <48> 8b 70 28 e8 3f 44 de ff 48 85 c0 48 89 c3 74 61 0f 1f 80 00
[ 69.327190] RIP [<ffffffff81322238>] blk_throtl_drain+0x28/0x110
[ 69.333402] RSP <ffff88021bda3b60>
[ 69.336937] CR2: 0000000000000028
[ 69.340275] ---[ end trace 3c89b44a6f5a2b9f ]---

Regards,
Maurizio Lombardi

2014-07-15 12:01:52

by Maurizio Lombardi

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment



On 07/15/2014 01:38 PM, Maurizio Lombardi wrote:
>
>
> On 07/15/2014 10:14 AM, Mike Qiu wrote:
>> My Power7 box fails to boot with commit:
>>
>> 254c4407cb84a6dec90336054615b0f0e996bb7c
>> bio: modify __bio_add_page() to accept pages that don't start a new segment
>>
>> Reverting it works for me.
>
> This looks strange to me because, even after reverting my patch, the kernel still panics
> with a different call trace:


Never mind, this is another bug in linux-next that is hit by cryptsetup.
Sorry for the noise.

2014-07-16 06:51:34

by Maurizio Lombardi

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

Hi,

On 07/15/2014 10:44 AM, Maurizio Lombardi wrote:
>> I reverted it yesterday in my tree.
>>
>
>
> The problem was here:
>
>         if (q->merge_bvec_fn) {
>                 struct bvec_merge_data bvm = {
>                         .bi_bdev = bio->bi_bdev,
>                         .bi_sector = bio->bi_iter.bi_sector,
>                         .bi_size = bio->bi_iter.bi_size,   <-------
>                         .bi_rw = bio->bi_rw,
>                 };
>
>                 /*
>                  * merge_bvec_fn() returns number of bytes it can accept
>                  * at this offset
>                  */
>                 if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len)
>                         goto failed;
>         }
>
>         /* If we may be able to merge these biovecs, force a recount */
>         if (bio->bi_vcnt > 1 && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))
>                 bio->bi_flags &= ~(1 << BIO_SEG_VALID);
>
>
> it should have been ".bi_size = bio->bi_iter.bi_size - len"
>

Jens, will you restore the patch in your tree if I submit
this fix?

Thanks,
Maurizio Lombardi

2014-07-16 07:54:03

by Jens Axboe

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

On 2014-07-16 08:51, Maurizio Lombardi wrote:
> Hi,
>
> On 07/15/2014 10:44 AM, Maurizio Lombardi wrote:
>>> I reverted it yesterday in my tree.
>>>
>>
>>
>> The problem was here:
>>
>>         if (q->merge_bvec_fn) {
>>                 struct bvec_merge_data bvm = {
>>                         .bi_bdev = bio->bi_bdev,
>>                         .bi_sector = bio->bi_iter.bi_sector,
>>                         .bi_size = bio->bi_iter.bi_size,   <-------
>>                         .bi_rw = bio->bi_rw,
>>                 };
>>
>>                 /*
>>                  * merge_bvec_fn() returns number of bytes it can accept
>>                  * at this offset
>>                  */
>>                 if (q->merge_bvec_fn(q, &bvm, bvec) < bvec->bv_len)
>>                         goto failed;
>>         }
>>
>>         /* If we may be able to merge these biovecs, force a recount */
>>         if (bio->bi_vcnt > 1 && (BIOVEC_PHYS_MERGEABLE(bvec-1, bvec)))
>>                 bio->bi_flags &= ~(1 << BIO_SEG_VALID);
>>
>>
>> it should have been ".bi_size = bio->bi_iter.bi_size - len"
>>
>
> Jens, will you restore the patch in your tree if I submit
> this fix?

Sure, we can try again, hopefully this will be the last of them.

--
Jens Axboe

2014-07-22 14:08:59

by Maurizio Lombardi

Subject: Re: Bug_ON with patch: bio: modify __bio_add_page() to accept pages that don't start a new segment

Hi Jens,

On 07/16/2014 09:53 AM, Jens Axboe wrote:
>
> Sure, we can try again, hopefully this will be the last of them.
>

I sent it; it must be applied on top of
"bio: modify __bio_add_page() to accept pages that don't start a new segment"

http://marc.info/?l=linux-kernel&m=140558697215009&w=2

I really hope this is the last fix to it.

Regards,
Maurizio Lombardi