2015-07-29 13:27:18

by Josh Boyer

Subject: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

Hi All,

We've gotten a report[1] that all of the upcoming Fedora 23 install
images are failing on 32-bit VMs/machines. Looking at the first
instance of the oops, it appears to be a bad page state: a page that
is still charged to a cgroup is being freed. The oops output is
below.

Has anyone seen this in their 32-bit testing at all? Thus far nobody
can recreate this on a 64-bit machine/VM.

josh

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382

[ 9.026738] systemd[1]: Switching root.
[ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
[ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
[ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
[ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
[ 9.087284] page dumped because: page still charged to cgroup
[ 9.088772] bad because of flags:
[ 9.089731] flags: 0x21(locked|lru)
[ 9.090818] page->mem_cgroup:f2c3e400
[ 9.091862] Modules linked in: loop nls_utf8 isofs 8021q garp stp
llc 8139too mrp 8139cp crc32_pclmul ata_generic crc32c_intel qxl
syscopyarea sysfillrect sysimgblt drm_kms_helper serio_raw mii
virtio_pci ttm pata_acpi drm scsi_dh_rdac scsi_dh_emc scsi_dh_alua
sunrpc dm_crypt dm_round_robin linear raid10 raid456 async_raid6_recov
async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0
iscsi_ibft iscsi_boot_sysfs floppy iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi squashfs cramfs edd dm_multipath
[ 9.104829] CPU: 0 PID: 745 Comm: kworker/u5:1 Not tainted
4.2.0-0.rc3.git4.1.fc23.i686 #1
[ 9.106987] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.7.5-20140709_153950- 04/01/2014
[ 9.109445] Workqueue: kloopd1 loop_queue_read_work [loop]
[ 9.110982] c0d439a7 af8cdfed 00000000 f6cfbd1c c0aa22c9 f3d32ae0
f6cfbd40 c054e30a
[ 9.113298] c0c6e4e0 f6dfc228 000372ac 00b13ce1 c0c7271d f2252178
00000000 f6cfbd60
[ 9.115562] c054eea9 f6cfbd5c 00000000 00000000 f3d32ae0 f3494000
40020021 f6cfbd8c
[ 9.117848] Call Trace:
[ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
[ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
[ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
[ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
[ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
[ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
[ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
[ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
[ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
[ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
[ 9.130923] [<c06f7ce2>] bounce_end_io_read+0x32/0x40
[ 9.131973] [<c06d8dc6>] bio_endio+0x56/0x90
[ 9.132953] [<c06df817>] blk_update_request+0x87/0x310
[ 9.134042] [<c04499f7>] ? kvm_clock_read+0x17/0x20
[ 9.135103] [<c040bdd8>] ? sched_clock+0x8/0x10
[ 9.136100] [<c06e7756>] blk_mq_end_request+0x16/0x60
[ 9.136912] [<c06e7fed>] __blk_mq_complete_request+0x9d/0xd0
[ 9.137730] [<c06e8035>] blk_mq_complete_request+0x15/0x20
[ 9.138515] [<f7e0851d>] loop_handle_cmd.isra.23+0x5d/0x8c0 [loop]
[ 9.139390] [<c0491b53>] ? pick_next_task_fair+0xa63/0xbb0
[ 9.140202] [<f7e08e60>] loop_queue_read_work+0x10/0x12 [loop]
[ 9.141043] [<c0471c55>] process_one_work+0x145/0x380
[ 9.141779] [<c0471ec9>] worker_thread+0x39/0x430
[ 9.142524] [<c0471e90>] ? process_one_work+0x380/0x380
[ 9.143303] [<c04772b6>] kthread+0xa6/0xc0
[ 9.143936] [<c0aa7a81>] ret_from_kernel_thread+0x21/0x30
[ 9.144742] [<c0477210>] ? kthread_worker_fn+0x130/0x130
[ 9.145529] Disabling lock debugging due to kernel taint


2015-07-29 13:51:40

by Johannes Weiner

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
> Hi All,
>
> We've gotten a report[1] that any of the upcoming Fedora 23 install
> images are all failing on 32-bit VMs/machines. Looking at the first
> instance of the oops, it seems to be a bad page state where a page is
> still charged to a group and it is trying to be freed. The oops
> output is below.
>
> Has anyone seen this in their 32-bit testing at all? Thus far nobody
> can recreate this on a 64-bit machine/VM.
>
> josh
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>
> [ 9.026738] systemd[1]: Switching root.
> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
> [ 9.087284] page dumped because: page still charged to cgroup
> [ 9.088772] bad because of flags:
> [ 9.089731] flags: 0x21(locked|lru)
> [ 9.090818] page->mem_cgroup:f2c3e400

It's also still locked and on the LRU. This page shouldn't have been
freed.

> [ 9.117848] Call Trace:
> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80

The page state looks completely off for a bounce buffer page. Did
somebody mess with a bounce bio's bv_page?

> [ 9.130923] [<c06f7ce2>] bounce_end_io_read+0x32/0x40
> [ 9.131973] [<c06d8dc6>] bio_endio+0x56/0x90
> [ 9.132953] [<c06df817>] blk_update_request+0x87/0x310
> [ 9.134042] [<c04499f7>] ? kvm_clock_read+0x17/0x20
> [ 9.135103] [<c040bdd8>] ? sched_clock+0x8/0x10
> [ 9.136100] [<c06e7756>] blk_mq_end_request+0x16/0x60
> [ 9.136912] [<c06e7fed>] __blk_mq_complete_request+0x9d/0xd0
> [ 9.137730] [<c06e8035>] blk_mq_complete_request+0x15/0x20
> [ 9.138515] [<f7e0851d>] loop_handle_cmd.isra.23+0x5d/0x8c0 [loop]
> [ 9.139390] [<c0491b53>] ? pick_next_task_fair+0xa63/0xbb0
> [ 9.140202] [<f7e08e60>] loop_queue_read_work+0x10/0x12 [loop]
> [ 9.141043] [<c0471c55>] process_one_work+0x145/0x380
> [ 9.141779] [<c0471ec9>] worker_thread+0x39/0x430
> [ 9.142524] [<c0471e90>] ? process_one_work+0x380/0x380
> [ 9.143303] [<c04772b6>] kthread+0xa6/0xc0
> [ 9.143936] [<c0aa7a81>] ret_from_kernel_thread+0x21/0x30
> [ 9.144742] [<c0477210>] ? kthread_worker_fn+0x130/0x130
> [ 9.145529] Disabling lock debugging due to kernel taint

2015-07-29 15:32:10

by Ming Lei

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
>> Hi All,
>>
>> We've gotten a report[1] that any of the upcoming Fedora 23 install
>> images are all failing on 32-bit VMs/machines. Looking at the first
>> instance of the oops, it seems to be a bad page state where a page is
>> still charged to a group and it is trying to be freed. The oops
>> output is below.
>>
>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
>> can recreate this on a 64-bit machine/VM.
>>
>> josh
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>>
>> [ 9.026738] systemd[1]: Switching root.
>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
>> [ 9.087284] page dumped because: page still charged to cgroup
>> [ 9.088772] bad because of flags:
>> [ 9.089731] flags: 0x21(locked|lru)
>> [ 9.090818] page->mem_cgroup:f2c3e400
>
> It's also still locked and on the LRU. This page shouldn't have been
> freed.
>
>> [ 9.117848] Call Trace:
>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
>
> The page state looks completely off for a bounce buffer page. Did
> somebody mess with a bounce bio's bv_page?

It looks like the page isn't touched in either lo_read_transfer() or
lo_read_simple().

It may be related to commit aa4d86163e4e ("block: loop: switch to VFS
ITER_BVEC"). If reverting aa4d86163e4e doesn't fix the issue, it might
be helpful to run 'git bisect', assuming the issue can be reproduced
easily.

>
>> [ 9.130923] [<c06f7ce2>] bounce_end_io_read+0x32/0x40
>> [ 9.131973] [<c06d8dc6>] bio_endio+0x56/0x90
>> [ 9.132953] [<c06df817>] blk_update_request+0x87/0x310
>> [ 9.134042] [<c04499f7>] ? kvm_clock_read+0x17/0x20
>> [ 9.135103] [<c040bdd8>] ? sched_clock+0x8/0x10
>> [ 9.136100] [<c06e7756>] blk_mq_end_request+0x16/0x60
>> [ 9.136912] [<c06e7fed>] __blk_mq_complete_request+0x9d/0xd0
>> [ 9.137730] [<c06e8035>] blk_mq_complete_request+0x15/0x20
>> [ 9.138515] [<f7e0851d>] loop_handle_cmd.isra.23+0x5d/0x8c0 [loop]
>> [ 9.139390] [<c0491b53>] ? pick_next_task_fair+0xa63/0xbb0
>> [ 9.140202] [<f7e08e60>] loop_queue_read_work+0x10/0x12 [loop]
>> [ 9.141043] [<c0471c55>] process_one_work+0x145/0x380
>> [ 9.141779] [<c0471ec9>] worker_thread+0x39/0x430
>> [ 9.142524] [<c0471e90>] ? process_one_work+0x380/0x380
>> [ 9.143303] [<c04772b6>] kthread+0xa6/0xc0
>> [ 9.143936] [<c0aa7a81>] ret_from_kernel_thread+0x21/0x30
>> [ 9.144742] [<c0477210>] ? kthread_worker_fn+0x130/0x130
>> [ 9.145529] Disabling lock debugging due to kernel taint

2015-07-29 16:36:53

by Josh Boyer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Wed, Jul 29, 2015 at 11:32 AM, Ming Lei <[email protected]> wrote:
> On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
>> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
>>> Hi All,
>>>
>>> We've gotten a report[1] that any of the upcoming Fedora 23 install
>>> images are all failing on 32-bit VMs/machines. Looking at the first
>>> instance of the oops, it seems to be a bad page state where a page is
>>> still charged to a group and it is trying to be freed. The oops
>>> output is below.
>>>
>>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
>>> can recreate this on a 64-bit machine/VM.
>>>
>>> josh
>>>
>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>>>
>>> [ 9.026738] systemd[1]: Switching root.
>>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
>>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
>>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
>>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
>>> [ 9.087284] page dumped because: page still charged to cgroup
>>> [ 9.088772] bad because of flags:
>>> [ 9.089731] flags: 0x21(locked|lru)
>>> [ 9.090818] page->mem_cgroup:f2c3e400
>>
>> It's also still locked and on the LRU. This page shouldn't have been
>> freed.
>>
>>> [ 9.117848] Call Trace:
>>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
>>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
>>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
>>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
>>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
>>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
>>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
>>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
>>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
>>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
>>
>> The page state looks completely off for a bounce buffer page. Did
>> somebody mess with a bounce bio's bv_page?
>
> Looks the page isn't touched in both lo_read_transfer() and
> lo_read_simple().
>
> Maybe it is related with aa4d86163e4e(block: loop: switch to VFS ITER_BVEC),
> or it might be helpful to run 'git bisect' if reverting aa4d86163e4e can't
> fix the issue, suppose the issue can be reproduced easily.

I can try reverting that and getting someone to test it. It is
somewhat complicated by having to spin a new install ISO, so a report
back will be somewhat delayed. In the meantime, I'm also asking
people to track down the first kernel build that hits this, so
hopefully that gives us more of a clue as well.

It is odd that only 32-bit hits this issue though. At least from what
we've seen thus far.

josh

2015-07-30 00:29:09

by Ming Lei

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Wed, Jul 29, 2015 at 12:36 PM, Josh Boyer <[email protected]> wrote:
> On Wed, Jul 29, 2015 at 11:32 AM, Ming Lei <[email protected]> wrote:
>> On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
>>> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
>>>> Hi All,
>>>>
>>>> We've gotten a report[1] that any of the upcoming Fedora 23 install
>>>> images are all failing on 32-bit VMs/machines. Looking at the first
>>>> instance of the oops, it seems to be a bad page state where a page is
>>>> still charged to a group and it is trying to be freed. The oops
>>>> output is below.
>>>>
>>>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
>>>> can recreate this on a 64-bit machine/VM.
>>>>
>>>> josh
>>>>
>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>>>>
>>>> [ 9.026738] systemd[1]: Switching root.
>>>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
>>>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
>>>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
>>>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
>>>> [ 9.087284] page dumped because: page still charged to cgroup
>>>> [ 9.088772] bad because of flags:
>>>> [ 9.089731] flags: 0x21(locked|lru)
>>>> [ 9.090818] page->mem_cgroup:f2c3e400
>>>
>>> It's also still locked and on the LRU. This page shouldn't have been
>>> freed.
>>>
>>>> [ 9.117848] Call Trace:
>>>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
>>>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
>>>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
>>>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
>>>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
>>>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
>>>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
>>>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
>>>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
>>>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
>>>
>>> The page state looks completely off for a bounce buffer page. Did
>>> somebody mess with a bounce bio's bv_page?
>>
>> Looks the page isn't touched in both lo_read_transfer() and
>> lo_read_simple().
>>
>> Maybe it is related with aa4d86163e4e(block: loop: switch to VFS ITER_BVEC),
>> or it might be helpful to run 'git bisect' if reverting aa4d86163e4e can't
>> fix the issue, suppose the issue can be reproduced easily.
>
> I can try reverting that and getting someone to test it. It is
> somewhat complicated by having to spin a new install ISO, so a report
> back will be somewhat delayed. In the meantime, I'm also asking
> people to track down the first kernel build that hits this, so
> hopefully that gives us more of a clue as well.
>
> It is odd that only 32-bit hits this issue though. At least from what
> we've seen thus far.

Page bouncing may only be relevant on 32-bit, and I will try to find an
ARM box to see if the issue can be reproduced easily there.

BTW, are there any extra steps for reproducing the issue? Such as
cgroup operations?

Thanks,

2015-07-30 11:27:10

by Josh Boyer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Wed, Jul 29, 2015 at 8:29 PM, Ming Lei <[email protected]> wrote:
> On Wed, Jul 29, 2015 at 12:36 PM, Josh Boyer <[email protected]> wrote:
>> On Wed, Jul 29, 2015 at 11:32 AM, Ming Lei <[email protected]> wrote:
>>> On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
>>>> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
>>>>> Hi All,
>>>>>
>>>>> We've gotten a report[1] that any of the upcoming Fedora 23 install
>>>>> images are all failing on 32-bit VMs/machines. Looking at the first
>>>>> instance of the oops, it seems to be a bad page state where a page is
>>>>> still charged to a group and it is trying to be freed. The oops
>>>>> output is below.
>>>>>
>>>>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
>>>>> can recreate this on a 64-bit machine/VM.
>>>>>
>>>>> josh
>>>>>
>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>>>>>
>>>>> [ 9.026738] systemd[1]: Switching root.
>>>>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
>>>>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
>>>>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
>>>>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
>>>>> [ 9.087284] page dumped because: page still charged to cgroup
>>>>> [ 9.088772] bad because of flags:
>>>>> [ 9.089731] flags: 0x21(locked|lru)
>>>>> [ 9.090818] page->mem_cgroup:f2c3e400
>>>>
>>>> It's also still locked and on the LRU. This page shouldn't have been
>>>> freed.
>>>>
>>>>> [ 9.117848] Call Trace:
>>>>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
>>>>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
>>>>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
>>>>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
>>>>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
>>>>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
>>>>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
>>>>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
>>>>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
>>>>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
>>>>
>>>> The page state looks completely off for a bounce buffer page. Did
>>>> somebody mess with a bounce bio's bv_page?
>>>
>>> Looks the page isn't touched in both lo_read_transfer() and
>>> lo_read_simple().
>>>
>>> Maybe it is related with aa4d86163e4e(block: loop: switch to VFS ITER_BVEC),
>>> or it might be helpful to run 'git bisect' if reverting aa4d86163e4e can't
>>> fix the issue, suppose the issue can be reproduced easily.
>>
>> I can try reverting that and getting someone to test it. It is
>> somewhat complicated by having to spin a new install ISO, so a report
>> back will be somewhat delayed. In the meantime, I'm also asking
>> people to track down the first kernel build that hits this, so
>> hopefully that gives us more of a clue as well.
>>
>> It is odd that only 32-bit hits this issue though. At least from what
>> we've seen thus far.
>
> Page bounce may be just valid on 32-bit, and I will try to find one ARM
> box to see if it can be reproduced easily.
>
> BTW, are there any extra steps for reproducing the issue? Such as
> cgroup operations?

I'm not entirely sure what the install environment on the ISOs is
doing, but nobody sees this issue with a kernel after install. Thus
far recreate efforts have focused on recreating the install ISOs using
various kernels. That is working, but I don't expect other people to
easily be able to do that.

Also, our primary tester seems to have narrowed it down to breaking
somewhere between 4.1-rc5 (good) and 4.1-rc6 (bad). I'll be working
with him today to isolate it further, but the commit you pointed out
was in 4.1-rc1 and that worked. He still needs to test a 4.2-rc4
kernel with it reverted, but so far it seems to be something else that
came in with the 4.1 kernel.

josh

2015-07-30 23:14:13

by Josh Boyer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Thu, Jul 30, 2015 at 7:27 AM, Josh Boyer <[email protected]> wrote:
> On Wed, Jul 29, 2015 at 8:29 PM, Ming Lei <[email protected]> wrote:
>> On Wed, Jul 29, 2015 at 12:36 PM, Josh Boyer <[email protected]> wrote:
>>> On Wed, Jul 29, 2015 at 11:32 AM, Ming Lei <[email protected]> wrote:
>>>> On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
>>>>> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> We've gotten a report[1] that any of the upcoming Fedora 23 install
>>>>>> images are all failing on 32-bit VMs/machines. Looking at the first
>>>>>> instance of the oops, it seems to be a bad page state where a page is
>>>>>> still charged to a group and it is trying to be freed. The oops
>>>>>> output is below.
>>>>>>
>>>>>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
>>>>>> can recreate this on a 64-bit machine/VM.
>>>>>>
>>>>>> josh
>>>>>>
>>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>>>>>>
>>>>>> [ 9.026738] systemd[1]: Switching root.
>>>>>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
>>>>>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
>>>>>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
>>>>>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
>>>>>> [ 9.087284] page dumped because: page still charged to cgroup
>>>>>> [ 9.088772] bad because of flags:
>>>>>> [ 9.089731] flags: 0x21(locked|lru)
>>>>>> [ 9.090818] page->mem_cgroup:f2c3e400
>>>>>
>>>>> It's also still locked and on the LRU. This page shouldn't have been
>>>>> freed.
>>>>>
>>>>>> [ 9.117848] Call Trace:
>>>>>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
>>>>>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
>>>>>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
>>>>>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
>>>>>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
>>>>>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
>>>>>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
>>>>>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
>>>>>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
>>>>>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
>>>>>
>>>>> The page state looks completely off for a bounce buffer page. Did
>>>>> somebody mess with a bounce bio's bv_page?
>>>>
>>>> Looks the page isn't touched in both lo_read_transfer() and
>>>> lo_read_simple().
>>>>
>>>> Maybe it is related with aa4d86163e4e(block: loop: switch to VFS ITER_BVEC),
>>>> or it might be helpful to run 'git bisect' if reverting aa4d86163e4e can't
>>>> fix the issue, suppose the issue can be reproduced easily.
>>>
>>> I can try reverting that and getting someone to test it. It is
>>> somewhat complicated by having to spin a new install ISO, so a report
>>> back will be somewhat delayed. In the meantime, I'm also asking
>>> people to track down the first kernel build that hits this, so
>>> hopefully that gives us more of a clue as well.

The revert of that patch did not fix the issue.

>>> It is odd that only 32-bit hits this issue though. At least from what
>>> we've seen thus far.
>>
>> Page bounce may be just valid on 32-bit, and I will try to find one ARM
>> box to see if it can be reproduced easily.
>>
>> BTW, are there any extra steps for reproducing the issue? Such as
>> cgroup operations?
>
> I'm not entirely sure what the install environment on the ISOs is
> doing, but nobody sees this issue with a kernel after install. Thus
> far recreate efforts have focused on recreating the install ISOs using
> various kernels. That is working, but I don't expect other people to
> easily be able to do that.
>
> Also, our primary tester seems to have narrowed it down to breaking
> somewhere between 4.1-rc5 (good) and 4.1-rc6 (bad). I'll be working
> with him today to isolate it further, but the commit you pointed out
> was in 4.1-rc1 and that worked. He still needs to test a 4.2-rc4
> kernel with it reverted, but so far it seems to be something else that
> came in with the 4.1 kernel.

After doing some RPM bisecting, we've narrowed it down to the
following commit range:

[jwboyer@vader linux]$ git log --pretty=oneline c2102f3d73d8..0f1e5b5d19f6
0f1e5b5d19f6c06fe2078f946377db9861f3910d Merge tag 'dm-4.1-fixes-3' of
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
1c220c69ce0dcc0f234a9f263ad9c0864f971852 dm: fix casting bug in dm_merge_bvec()
15b94a690470038aa08247eedbebbe7e2218d5ee dm: fix reload failure of 0
path multipath mapping on blk-mq devices
e5d8de32cc02a259e1a237ab57cba00f2930fa6a dm: fix false warning in
free_rq_clone() for unmapped requests
45714fbed4556149d7f1730f5bae74f81d5e2cd5 dm: requeue from blk-mq
dm_mq_queue_rq() using BLK_MQ_RQ_QUEUE_BUSY
4c6dd53dd3674c310d7379c6b3273daa9fd95c79 dm mpath: fix leak of
dm_mpath_io structure in blk-mq .queue_rq error path
3a1407559a593d4360af12dd2df5296bf8eb0d28 dm: fix NULL pointer when
clone_and_map_rq returns !DM_MAPIO_REMAPPED
4ae9944d132b160d444fa3aa875307eb0fa3eeec dm: run queue on re-queue
[jwboyer@vader linux]$

It is interesting to note that we're also carrying a patch in our 4.1
kernel for loop performance reasons that went into upstream 4.2. That
patch is blk-loop-avoid-too-many-pending-per-work-IO.patch which
corresponds to upstream commit
4d4e41aef9429872ea3b105e83426941f7185ab6. All of those commits are in
4.2-rcX, which matches the failures we're seeing.

We can try a 4.1-rc5 snapshot build without the block patch to see if
that helps, but the patch was included in all the previously tested
good kernels and the issue only appeared after the DM merge commits
were included.

josh

2015-07-31 00:19:11

by Mike Snitzer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Thu, Jul 30 2015 at 7:14pm -0400,
Josh Boyer <[email protected]> wrote:

> On Thu, Jul 30, 2015 at 7:27 AM, Josh Boyer <[email protected]> wrote:
> > On Wed, Jul 29, 2015 at 8:29 PM, Ming Lei <[email protected]> wrote:
> >> On Wed, Jul 29, 2015 at 12:36 PM, Josh Boyer <[email protected]> wrote:
> >>> On Wed, Jul 29, 2015 at 11:32 AM, Ming Lei <[email protected]> wrote:
> >>>> On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
> >>>>> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
> >>>>>> Hi All,
> >>>>>>
> >>>>>> We've gotten a report[1] that any of the upcoming Fedora 23 install
> >>>>>> images are all failing on 32-bit VMs/machines. Looking at the first
> >>>>>> instance of the oops, it seems to be a bad page state where a page is
> >>>>>> still charged to a group and it is trying to be freed. The oops
> >>>>>> output is below.
> >>>>>>
> >>>>>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
> >>>>>> can recreate this on a 64-bit machine/VM.
> >>>>>>
> >>>>>> josh
> >>>>>>
> >>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
> >>>>>>
> >>>>>> [ 9.026738] systemd[1]: Switching root.
> >>>>>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
> >>>>>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
> >>>>>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
> >>>>>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
> >>>>>> [ 9.087284] page dumped because: page still charged to cgroup
> >>>>>> [ 9.088772] bad because of flags:
> >>>>>> [ 9.089731] flags: 0x21(locked|lru)
> >>>>>> [ 9.090818] page->mem_cgroup:f2c3e400
> >>>>>
> >>>>> It's also still locked and on the LRU. This page shouldn't have been
> >>>>> freed.
> >>>>>
> >>>>>> [ 9.117848] Call Trace:
> >>>>>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
> >>>>>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
> >>>>>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
> >>>>>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
> >>>>>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
> >>>>>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
> >>>>>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
> >>>>>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
> >>>>>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
> >>>>>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
> >>>>>
> >>>>> The page state looks completely off for a bounce buffer page. Did
> >>>>> somebody mess with a bounce bio's bv_page?
> >>>>
> >>>> Looks the page isn't touched in both lo_read_transfer() and
> >>>> lo_read_simple().
> >>>>
> >>>> Maybe it is related with aa4d86163e4e(block: loop: switch to VFS ITER_BVEC),
> >>>> or it might be helpful to run 'git bisect' if reverting aa4d86163e4e can't
> >>>> fix the issue, suppose the issue can be reproduced easily.
> >>>
> >>> I can try reverting that and getting someone to test it. It is
> >>> somewhat complicated by having to spin a new install ISO, so a report
> >>> back will be somewhat delayed. In the meantime, I'm also asking
> >>> people to track down the first kernel build that hits this, so
> >>> hopefully that gives us more of a clue as well.
>
> The revert of that patch did not fix the issue.
>
> >>> It is odd that only 32-bit hits this issue though. At least from what
> >>> we've seen thus far.
> >>
> >> Page bounce may be just valid on 32-bit, and I will try to find one ARM
> >> box to see if it can be reproduced easily.
> >>
> >> BTW, are there any extra steps for reproducing the issue? Such as
> >> cgroup operations?
> >
> > I'm not entirely sure what the install environment on the ISOs is
> > doing, but nobody sees this issue with a kernel after install. Thus
> > far recreate efforts have focused on recreating the install ISOs using
> > various kernels. That is working, but I don't expect other people to
> > easily be able to do that.
> >
> > Also, our primary tester seems to have narrowed it down to breaking
> > somewhere between 4.1-rc5 (good) and 4.1-rc6 (bad). I'll be working
> > with him today to isolate it further, but the commit you pointed out
> > was in 4.1-rc1 and that worked. He still needs to test a 4.2-rc4
> > kernel with it reverted, but so far it seems to be something else that
> > came in with the 4.1 kernel.
>
> After doing some RPM bisecting, we've narrowed it down to the
> following commit range:
>
> [jwboyer@vader linux]$ git log --pretty=oneline c2102f3d73d8..0f1e5b5d19f6
> 0f1e5b5d19f6c06fe2078f946377db9861f3910d Merge tag 'dm-4.1-fixes-3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
> 1c220c69ce0dcc0f234a9f263ad9c0864f971852 dm: fix casting bug in dm_merge_bvec()
> 15b94a690470038aa08247eedbebbe7e2218d5ee dm: fix reload failure of 0
> path multipath mapping on blk-mq devices
> e5d8de32cc02a259e1a237ab57cba00f2930fa6a dm: fix false warning in
> free_rq_clone() for unmapped requests
> 45714fbed4556149d7f1730f5bae74f81d5e2cd5 dm: requeue from blk-mq
> dm_mq_queue_rq() using BLK_MQ_RQ_QUEUE_BUSY
> 4c6dd53dd3674c310d7379c6b3273daa9fd95c79 dm mpath: fix leak of
> dm_mpath_io structure in blk-mq .queue_rq error path
> 3a1407559a593d4360af12dd2df5296bf8eb0d28 dm: fix NULL pointer when
> clone_and_map_rq returns !DM_MAPIO_REMAPPED
> 4ae9944d132b160d444fa3aa875307eb0fa3eeec dm: run queue on re-queue
> [jwboyer@vader linux]$
>
> It is interesting to note that we're also carrying a patch in our 4.1
> kernel for loop performance reasons that went into upstream 4.2. That
> patch is blk-loop-avoid-too-many-pending-per-work-IO.patch which
> corresponds to upstream commit
> 4d4e41aef9429872ea3b105e83426941f7185ab6. All of those commits are in
> 4.2-rcX, which matches the failures we're seeing.
>
> We can try a 4.1-rc5 snapshot build without the block patch to see if
> that helps, but the patch was included in all the previously tested
> good kernels and the issue only appeared after the DM merge commits
> were included.

The only commit that looks even remotely related (given 32bit concerns)
would be 1c220c69ce0dcc0f234a9f263ad9c0864f971852

All the other DM commits are request-based changes that, AFAICT, aren't
applicable.

2015-07-31 18:58:07

by Josh Boyer


Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Thu, Jul 30, 2015 at 8:19 PM, Mike Snitzer <[email protected]> wrote:
> On Thu, Jul 30 2015 at 7:14pm -0400,
> Josh Boyer <[email protected]> wrote:
>
>> On Thu, Jul 30, 2015 at 7:27 AM, Josh Boyer <[email protected]> wrote:
>> > On Wed, Jul 29, 2015 at 8:29 PM, Ming Lei <[email protected]> wrote:
>> >> On Wed, Jul 29, 2015 at 12:36 PM, Josh Boyer <[email protected]> wrote:
>> >>> On Wed, Jul 29, 2015 at 11:32 AM, Ming Lei <[email protected]> wrote:
>> >>>> On Wed, Jul 29, 2015 at 9:51 AM, Johannes Weiner <[email protected]> wrote:
>> >>>>> On Wed, Jul 29, 2015 at 09:27:16AM -0400, Josh Boyer wrote:
>> >>>>>> Hi All,
>> >>>>>>
>> >>>>>> We've gotten a report[1] that any of the upcoming Fedora 23 install
>> >>>>>> images are all failing on 32-bit VMs/machines. Looking at the first
>> >>>>>> instance of the oops, it seems to be a bad page state where a page is
>> >>>>>> still charged to a group and it is trying to be freed. The oops
>> >>>>>> output is below.
>> >>>>>>
>> >>>>>> Has anyone seen this in their 32-bit testing at all? Thus far nobody
>> >>>>>> can recreate this on a 64-bit machine/VM.
>> >>>>>>
>> >>>>>> josh
>> >>>>>>
>> >>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1247382
>> >>>>>>
>> >>>>>> [ 9.026738] systemd[1]: Switching root.
>> >>>>>> [ 9.036467] systemd-journald[149]: Received SIGTERM from PID 1 (systemd).
>> >>>>>> [ 9.082262] BUG: Bad page state in process kworker/u5:1 pfn:372ac
>> >>>>>> [ 9.083989] page:f3d32ae0 count:0 mapcount:0 mapping:f2252178 index:0x16a
>> >>>>>> [ 9.085755] flags: 0x40020021(locked|lru|mappedtodisk)
>> >>>>>> [ 9.087284] page dumped because: page still charged to cgroup
>> >>>>>> [ 9.088772] bad because of flags:
>> >>>>>> [ 9.089731] flags: 0x21(locked|lru)
>> >>>>>> [ 9.090818] page->mem_cgroup:f2c3e400
>> >>>>>
>> >>>>> It's also still locked and on the LRU. This page shouldn't have been
>> >>>>> freed.
>> >>>>>
>> >>>>>> [ 9.117848] Call Trace:
>> >>>>>> [ 9.118738] [<c0aa22c9>] dump_stack+0x41/0x52
>> >>>>>> [ 9.120034] [<c054e30a>] bad_page.part.80+0xaa/0x100
>> >>>>>> [ 9.121461] [<c054eea9>] free_pages_prepare+0x3b9/0x3f0
>> >>>>>> [ 9.122934] [<c054fae2>] free_hot_cold_page+0x22/0x160
>> >>>>>> [ 9.124400] [<c071a22f>] ? copy_to_iter+0x1af/0x2a0
>> >>>>>> [ 9.125750] [<c054c4a3>] ? mempool_free_slab+0x13/0x20
>> >>>>>> [ 9.126840] [<c054fc57>] __free_pages+0x37/0x50
>> >>>>>> [ 9.127849] [<c054c4fd>] mempool_free_pages+0xd/0x10
>> >>>>>> [ 9.128908] [<c054c8b6>] mempool_free+0x26/0x80
>> >>>>>> [ 9.129895] [<c06f77e6>] bounce_end_io+0x56/0x80
>> >>>>>
>> >>>>> The page state looks completely off for a bounce buffer page. Did
>> >>>>> somebody mess with a bounce bio's bv_page?
>> >>>>
>> >>>> It looks like the page isn't touched in either lo_read_transfer() or
>> >>>> lo_read_simple().
>> >>>>
>> >>>> Maybe it is related to aa4d86163e4e ("block: loop: switch to VFS ITER_BVEC"),
>> >>>> or it might be helpful to run 'git bisect' if reverting aa4d86163e4e can't
>> >>>> fix the issue, supposing the issue can be reproduced easily.
>> >>>
>> >>> I can try reverting that and getting someone to test it. It is
>> >>> somewhat complicated by having to spin a new install ISO, so a report
>> >>> back will be somewhat delayed. In the meantime, I'm also asking
>> >>> people to track down the first kernel build that hits this, so
>> >>> hopefully that gives us more of a clue as well.
>>
>> The revert of that patch did not fix the issue.
>>
>> >>> It is odd that only 32-bit hits this issue though. At least from what
>> >>> we've seen thus far.
>> >>
>> >> Page bouncing may only come into play on 32-bit, and I will try to find an ARM
>> >> box to see if it can be reproduced easily.
>> >>
>> >> BTW, are there any extra steps for reproducing the issue? Such as
>> >> cgroup operations?
>> >
>> > I'm not entirely sure what the install environment on the ISOs is
>> > doing, but nobody sees this issue with a kernel after install. Thus
>> > far recreate efforts have focused on recreating the install ISOs using
>> > various kernels. That is working, but I don't expect other people to
>> > easily be able to do that.
>> >
>> > Also, our primary tester seems to have narrowed it down to breaking
>> > somewhere between 4.1-rc5 (good) and 4.1-rc6 (bad). I'll be working
>> > with him today to isolate it further, but the commit you pointed out
>> > was in 4.1-rc1 and that worked. He still needs to test a 4.2-rc4
>> > kernel with it reverted, but so far it seems to be something else that
>> > came in with the 4.1 kernel.
>>
>> After doing some RPM bisecting, we've narrowed it down to the
>> following commit range:
>>
>> [jwboyer@vader linux]$ git log --pretty=oneline c2102f3d73d8..0f1e5b5d19f6
>> 0f1e5b5d19f6c06fe2078f946377db9861f3910d Merge tag 'dm-4.1-fixes-3' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
>> 1c220c69ce0dcc0f234a9f263ad9c0864f971852 dm: fix casting bug in dm_merge_bvec()
>> 15b94a690470038aa08247eedbebbe7e2218d5ee dm: fix reload failure of 0
>> path multipath mapping on blk-mq devices
>> e5d8de32cc02a259e1a237ab57cba00f2930fa6a dm: fix false warning in
>> free_rq_clone() for unmapped requests
>> 45714fbed4556149d7f1730f5bae74f81d5e2cd5 dm: requeue from blk-mq
>> dm_mq_queue_rq() using BLK_MQ_RQ_QUEUE_BUSY
>> 4c6dd53dd3674c310d7379c6b3273daa9fd95c79 dm mpath: fix leak of
>> dm_mpath_io structure in blk-mq .queue_rq error path
>> 3a1407559a593d4360af12dd2df5296bf8eb0d28 dm: fix NULL pointer when
>> clone_and_map_rq returns !DM_MAPIO_REMAPPED
>> 4ae9944d132b160d444fa3aa875307eb0fa3eeec dm: run queue on re-queue
>> [jwboyer@vader linux]$
>>
>> It is interesting to note that we're also carrying a patch in our 4.1
>> kernel for loop performance reasons that went into upstream 4.2. That
>> patch is blk-loop-avoid-too-many-pending-per-work-IO.patch which
>> corresponds to upstream commit
>> 4d4e41aef9429872ea3b105e83426941f7185ab6. All of those commits are in
>> 4.2-rcX, which matches the failures we're seeing.
>>
>> We can try a 4.1-rc5 snapshot build without the block patch to see if
>> that helps, but the patch was included in all the previously tested
>> good kernels and the issue only appeared after the DM merge commits
>> were included.
>
> The only commit that looks even remotely related (given 32bit concerns)
> would be 1c220c69ce0dcc0f234a9f263ad9c0864f971852

Confirmed. I built kernels for our tester that started with the
working snapshot and applied the patches above one at a time. The
failing patch was the commit you suspected.

I can try and build a 4.2-rc4 kernel with that reverted, but it would
be good if someone could start thinking about how that could cause
this issue.

josh

2015-08-02 14:01:40

by Josh Boyer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Fri, Jul 31, 2015 at 2:58 PM, Josh Boyer <[email protected]> wrote:
> On Thu, Jul 30, 2015 at 8:19 PM, Mike Snitzer <[email protected]> wrote:
>> On Thu, Jul 30 2015 at 7:14pm -0400,
>> Josh Boyer <[email protected]> wrote:
>>
>>> [...]
>>
>> The only commit that looks even remotely related (given 32bit concerns)
>> would be 1c220c69ce0dcc0f234a9f263ad9c0864f971852
>
> Confirmed. I built kernels for our tester that started with the
> working snapshot and applied the patches above one at a time. The
> failing patch was the commit you suspected.
>
> I can try and build a 4.2-rc4 kernel with that reverted, but it would
> be good if someone could start thinking about how that could cause
> this issue.

A revert on top of 4.2-rc4 booted. So this is currently causing
issues with upstream as well.

josh

2015-08-03 14:28:25

by Mike Snitzer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Sun, Aug 02 2015 at 10:01P -0400,
Josh Boyer <[email protected]> wrote:

> On Fri, Jul 31, 2015 at 2:58 PM, Josh Boyer <[email protected]> wrote:
> > On Thu, Jul 30, 2015 at 8:19 PM, Mike Snitzer <[email protected]> wrote:
> >>
> >> The only commit that looks even remotely related (given 32bit concerns)
> >> would be 1c220c69ce0dcc0f234a9f263ad9c0864f971852
> >
> > Confirmed. I built kernels for our tester that started with the
> > working snapshot and applied the patches above one at a time. The
> > failing patch was the commit you suspected.
> >
> > I can try and build a 4.2-rc4 kernel with that reverted, but it would
> > be good if someone could start thinking about how that could cause
> > this issue.
>
> A revert on top of 4.2-rc4 booted. So this is currently causing
> issues with upstream as well.

Hi Josh,

I've staged the following fix in linux-next (for 4.2-rc6 inclusion):
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=76270d574acc897178a5c8be0bd2a743a77e4bac

Can you please verify that it works for your 32bit testcase against
4.2-rc4 (or rc5)?

Thanks.

From: Mike Snitzer <[email protected]>
Date: Mon, 3 Aug 2015 09:54:58 -0400
Subject: [PATCH] dm: fix dm_merge_bvec regression on 32 bit systems

A DM regression on 32 bit systems was reported against v4.2-rc3 here:
https://lkml.org/lkml/2015/7/29/401

Fix this by reverting both commit 1c220c69 ("dm: fix casting bug in
dm_merge_bvec()") and 148e51ba ("dm: improve documentation and code
clarity in dm_merge_bvec"). This combined revert is done to eliminate
the possibility of a partial revert in stable@ kernels.

In hindsight the correct fix, at the time 1c220c69 was applied to fix
the regression that 148e51ba introduced, should've been to simply revert
148e51ba.

Reported-by: Josh Boyer <[email protected]>
Acked-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Cc: [email protected] # 3.19+
---
drivers/md/dm.c | 27 ++++++++++-----------------
1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index ab37ae1..0d7ab20 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1729,7 +1729,8 @@ static int dm_merge_bvec(struct request_queue *q,
struct mapped_device *md = q->queuedata;
struct dm_table *map = dm_get_live_table_fast(md);
struct dm_target *ti;
- sector_t max_sectors, max_size = 0;
+ sector_t max_sectors;
+ int max_size = 0;

if (unlikely(!map))
goto out;
@@ -1742,18 +1743,10 @@ static int dm_merge_bvec(struct request_queue *q,
* Find maximum amount of I/O that won't need splitting
*/
max_sectors = min(max_io_len(bvm->bi_sector, ti),
- (sector_t) queue_max_sectors(q));
+ (sector_t) BIO_MAX_SECTORS);
max_size = (max_sectors << SECTOR_SHIFT) - bvm->bi_size;
-
- /*
- * FIXME: this stop-gap fix _must_ be cleaned up (by passing a sector_t
- * to the targets' merge function since it holds sectors not bytes).
- * Just doing this as an interim fix for stable@ because the more
- * comprehensive cleanup of switching to sector_t will impact every
- * DM target that implements a ->merge hook.
- */
- if (max_size > INT_MAX)
- max_size = INT_MAX;
+ if (max_size < 0)
+ max_size = 0;

/*
* merge_bvec_fn() returns number of bytes
@@ -1761,13 +1754,13 @@ static int dm_merge_bvec(struct request_queue *q,
* max is precomputed maximal io size
*/
if (max_size && ti->type->merge)
- max_size = ti->type->merge(ti, bvm, biovec, (int) max_size);
+ max_size = ti->type->merge(ti, bvm, biovec, max_size);
/*
* If the target doesn't support merge method and some of the devices
- * provided their merge_bvec method (we know this by looking for the
- * max_hw_sectors that dm_set_device_limits may set), then we can't
- * allow bios with multiple vector entries. So always set max_size
- * to 0, and the code below allows just one page.
+ * provided their merge_bvec method (we know this by looking at
+ * queue_max_hw_sectors), then we can't allow bios with multiple vector
+ * entries. So always set max_size to 0, and the code below allows
+ * just one page.
*/
else if (queue_max_hw_sectors(q) <= PAGE_SIZE >> 9)
max_size = 0;
--
2.3.2 (Apple Git-55)

2015-08-03 16:56:32

by Josh Boyer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Mon, Aug 3, 2015 at 10:28 AM, Mike Snitzer <[email protected]> wrote:
> On Sun, Aug 02 2015 at 10:01P -0400,
> Josh Boyer <[email protected]> wrote:
>
>> On Fri, Jul 31, 2015 at 2:58 PM, Josh Boyer <[email protected]> wrote:
>> > On Thu, Jul 30, 2015 at 8:19 PM, Mike Snitzer <[email protected]> wrote:
>> >>
>> >> The only commit that looks even remotely related (given 32bit concerns)
>> >> would be 1c220c69ce0dcc0f234a9f263ad9c0864f971852
>> >
>> > Confirmed. I built kernels for our tester that started with the
>> > working snapshot and applied the patches above one at a time. The
>> > failing patch was the commit you suspected.
>> >
>> > I can try and build a 4.2-rc4 kernel with that reverted, but it would
>> > be good if someone could start thinking about how that could cause
>> > this issue.
>>
>> A revert on top of 4.2-rc4 booted. So this is currently causing
>> issues with upstream as well.
>
> Hi Josh,
>
> I've staged the following fix in linux-next (for 4.2-rc6 inclusion):
> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=76270d574acc897178a5c8be0bd2a743a77e4bac
>
> Can you please verify that it works for your 32bit testcase against
> 4.2-rc4 (or rc5)?

Sure, I'll get a kernel with this included spun up and ask Adam to test.

josh

> [...]

2015-08-04 01:11:10

by Josh Boyer

Subject: Re: cgroup/loop Bad page state oops in Linux v4.2-rc3-136-g45b4b782e848

On Mon, Aug 3, 2015 at 12:56 PM, Josh Boyer <[email protected]> wrote:
> On Mon, Aug 3, 2015 at 10:28 AM, Mike Snitzer <[email protected]> wrote:
>> On Sun, Aug 02 2015 at 10:01P -0400,
>> Josh Boyer <[email protected]> wrote:
>>
>>> On Fri, Jul 31, 2015 at 2:58 PM, Josh Boyer <[email protected]> wrote:
>>> > On Thu, Jul 30, 2015 at 8:19 PM, Mike Snitzer <[email protected]> wrote:
>>> >>
>>> >> The only commit that looks even remotely related (given 32bit concerns)
>>> >> would be 1c220c69ce0dcc0f234a9f263ad9c0864f971852
>>> >
>>> > Confirmed. I built kernels for our tester that started with the
>>> > working snapshot and applied the patches above one at a time. The
>>> > failing patch was the commit you suspected.
>>> >
>>> > I can try and build a 4.2-rc4 kernel with that reverted, but it would
>>> > be good if someone could start thinking about how that could cause
>>> > this issue.
>>>
>>> A revert on top of 4.2-rc4 booted. So this is currently causing
>>> issues with upstream as well.
>>
>> Hi Josh,
>>
>> I've staged the following fix in linux-next (for 4.2-rc6 inclusion):
>> https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=76270d574acc897178a5c8be0bd2a743a77e4bac
>>
>> Can you please verify that it works for your 32bit testcase against
>> 4.2-rc4 (or rc5)?
>
> Sure, I'll get a kernel with this included spun up and ask Adam to test.

Adam tested this with success. If you're still collecting patch
metadata, adding:

Tested-by: Adam Williamson <[email protected]>

would be appreciated.

josh

>> [...]