2020-01-07 12:48:20

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

Hi,

On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> There are two issues in get_max_segment_size():
>
> 1) the default segment boudary mask is bypassed, and some devices still
> require segment to not cross the default 4G boundary
>
> 2) the segment start address isn't taken into account when checking
> segment boundary limit
>
> Fixes the two issues.
>
> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> Signed-off-by: Ming Lei <[email protected]>

This patch, pushed into mainline as "block: fix splitting segments on
boundary masks", results in the following crash when booting 'versatilepb'
in qemu from disk. Bisect log is attached. Detailed log is at
https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio

Guenter

---
Crash:

kernel BUG at block/bio.c:1885!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0 PID: 1 Comm: init Not tainted 5.5.0-rc5 #1
Hardware name: ARM-Versatile (Device Tree Support)
PC is at bio_split+0x10c/0x15c
LR is at __blk_queue_split+0x378/0x628
...
[<c03b716c>] (bio_split) from [<c03c24c8>] (__blk_queue_split+0x378/0x628)
[<c03c24c8>] (__blk_queue_split) from [<c03c82cc>] (blk_mq_make_request+0x6c/0x7e4)
[<c03c82cc>] (blk_mq_make_request) from [<c03bc7d0>] (generic_make_request+0xec/0x340)
[<c03bc7d0>] (generic_make_request) from [<c03bca70>] (submit_bio+0x4c/0x170)
[<c03bca70>] (submit_bio) from [<c020666c>] (ext4_mpage_readpages+0x54c/0x8e0)
[<c020666c>] (ext4_mpage_readpages) from [<c01e1ec8>] (ext4_readpages+0x40/0x50)
[<c01e1ec8>] (ext4_readpages) from [<c00df808>] (read_pages+0x50/0x13c)
[<c00df808>] (read_pages) from [<c00dfd2c>] (__do_page_cache_readahead+0x1a8/0x1f0)
[<c00dfd2c>] (__do_page_cache_readahead) from [<c00d500c>] (filemap_fault+0x440/0x8f4)
[<c00d500c>] (filemap_fault) from [<c01ed494>] (ext4_filemap_fault+0x28/0x3c)
[<c01ed494>] (ext4_filemap_fault) from [<c0100b88>] (__do_fault+0x3c/0x1c0)
[<c0100b88>] (__do_fault) from [<c0105360>] (handle_mm_fault+0x284/0xaf4)
[<c0105360>] (handle_mm_fault) from [<c001f01c>] (do_page_fault+0x114/0x2e0)
[<c001f01c>] (do_page_fault) from [<c001f34c>] (do_DataAbort+0x38/0xbc)
[<c001f34c>] (do_DataAbort) from [<c000a0dc>] (__dabt_svc+0x5c/0xa0)
Exception stack(0xc783be28 to 0xc783be70)
be20: b6f81898 00000760 00000000 00000055 00000051 c0af3068
be40: c71f7a00 c71f7800 b6f51000 c71df320 c7954c80 00000006 00000000 c783be78
be60: c01a099c c07a4bb4 20000153 ffffffff
[<c000a0dc>] (__dabt_svc) from [<c07a4bb4>] (__clear_user_std+0x34/0x68)
[<c07a4bb4>] (__clear_user_std) from [<c01a099c>] (clear_user+0x40/0x50)
[<c01a099c>] (clear_user) from [<c019f204>] (load_elf_binary+0x1354/0x13c4)
[<c019f204>] (load_elf_binary) from [<c013996c>] (search_binary_handler.part.4+0x58/0x1fc)
[<c013996c>] (search_binary_handler.part.4) from [<c013b18c>] (__do_execve_file+0x780/0x9a4)
[<c013b18c>] (__do_execve_file) from [<c013b54c>] (do_execve+0x28/0x30)
[<c013b54c>] (do_execve) from [<c000af1c>] (try_to_run_init_process+0xc/0x3c)
[<c000af1c>] (try_to_run_init_process) from [<c07c42fc>] (kernel_init+0x88/0xf0)
[<c07c42fc>] (kernel_init) from [<c00090b0>] (ret_from_fork+0x14/0x24)
...
WARNING: CPU: 0 PID: 1 at kernel/exit.c:719 do_exit+0x54/0xb5c
Modules linked in:
CPU: 0 PID: 1 Comm: init Tainted: G D 5.5.0-rc5 #1
Hardware name: ARM-Versatile (Device Tree Support)
[<c001e8b0>] (unwind_backtrace) from [<c001a774>] (show_stack+0x10/0x14)
[<c001a774>] (show_stack) from [<c0027dc4>] (__warn+0xe4/0x108)
[<c0027dc4>] (__warn) from [<c0027e90>] (warn_slowpath_fmt+0xa8/0xb8)
[<c0027e90>] (warn_slowpath_fmt) from [<c0029dfc>] (do_exit+0x54/0xb5c)
[<c0029dfc>] (do_exit) from [<c001a918>] (die+0x1a0/0x274)
[<c001a918>] (die) from [<c001abfc>] (do_undefinstr+0xac/0x258)
[<c001abfc>] (do_undefinstr) from [<c000a238>] (__und_svc_finish+0x0/0x48)
Exception stack(0xc783b950 to 0xc783b998)
b940: c7226a80 00000000 00000c00 c7bebe00
b960: 00000000 c783b9e4 00000000 c7226a80 00007000 00000060 c7beb848 c783b9f0
b980: 00000060 c783b9a0 c03c24c8 c03b716c 60000153 ffffffff
[<c000a238>] (__und_svc_finish) from [<c03b716c>] (bio_split+0x10c/0x15c)
[<c03b716c>] (bio_split) from [<c03c24c8>] (__blk_queue_split+0x378/0x628)
[<c03c24c8>] (__blk_queue_split) from [<c03c82cc>] (blk_mq_make_request+0x6c/0x7e4)
[<c03c82cc>] (blk_mq_make_request) from [<c03bc7d0>] (generic_make_request+0xec/0x340)
[<c03bc7d0>] (generic_make_request) from [<c03bca70>] (submit_bio+0x4c/0x170)
[<c03bca70>] (submit_bio) from [<c020666c>] (ext4_mpage_readpages+0x54c/0x8e0)
[<c020666c>] (ext4_mpage_readpages) from [<c01e1ec8>] (ext4_readpages+0x40/0x50)
[<c01e1ec8>] (ext4_readpages) from [<c00df808>] (read_pages+0x50/0x13c)
[<c00df808>] (read_pages) from [<c00dfd2c>] (__do_page_cache_readahead+0x1a8/0x1f0)
[<c00dfd2c>] (__do_page_cache_readahead) from [<c00d500c>] (filemap_fault+0x440/0x8f4)
[<c00d500c>] (filemap_fault) from [<c01ed494>] (ext4_filemap_fault+0x28/0x3c)
[<c01ed494>] (ext4_filemap_fault) from [<c0100b88>] (__do_fault+0x3c/0x1c0)
[<c0100b88>] (__do_fault) from [<c0105360>] (handle_mm_fault+0x284/0xaf4)
[<c0105360>] (handle_mm_fault) from [<c001f01c>] (do_page_fault+0x114/0x2e0)
[<c001f01c>] (do_page_fault) from [<c001f34c>] (do_DataAbort+0x38/0xbc)
[<c001f34c>] (do_DataAbort) from [<c000a0dc>] (__dabt_svc+0x5c/0xa0)
Exception stack(0xc783be28 to 0xc783be70)
be20: b6f81898 00000760 00000000 00000055 00000051 c0af3068
be40: c71f7a00 c71f7800 b6f51000 c71df320 c7954c80 00000006 00000000 c783be78
be60: c01a099c c07a4bb4 20000153 ffffffff
[<c000a0dc>] (__dabt_svc) from [<c07a4bb4>] (__clear_user_std+0x34/0x68)
[<c07a4bb4>] (__clear_user_std) from [<c01a099c>] (clear_user+0x40/0x50)
[<c01a099c>] (clear_user) from [<c019f204>] (load_elf_binary+0x1354/0x13c4)
[<c019f204>] (load_elf_binary) from [<c013996c>] (search_binary_handler.part.4+0x58/0x1fc)
[<c013996c>] (search_binary_handler.part.4) from [<c013b18c>] (__do_execve_file+0x780/0x9a4)
[<c013b18c>] (__do_execve_file) from [<c013b54c>] (do_execve+0x28/0x30)
[<c013b54c>] (do_execve) from [<c000af1c>] (try_to_run_init_process+0xc/0x3c)
[<c000af1c>] (try_to_run_init_process) from [<c07c42fc>] (kernel_init+0x88/0xf0)
[<c07c42fc>] (kernel_init) from [<c00090b0>] (ret_from_fork+0x14/0x24)
Exception stack(0xc783bfb0 to 0xc783bff8)
bfa0: 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
irq event stamp: 159876
hardirqs last enabled at (159875): [<c009123c>] ktime_get+0x74/0x16c
hardirqs last disabled at (159876): [<c000a21c>] __und_svc+0x5c/0x70
softirqs last enabled at (159728): [<c000aaf0>] __do_softirq+0x308/0x4bc
softirqs last disabled at (159697): [<c002cf94>] irq_exit+0x150/0x178
---[ end trace 42dd349d5c0726c1 ]---
BUG: sleeping function called from invalid context at ./include/linux/cgroup-defs.h:747
in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 1, name: init
INFO: lockdep is turned off.
irq event stamp: 159876
hardirqs last enabled at (159875): [<c009123c>] ktime_get+0x74/0x16c
hardirqs last disabled at (159876): [<c000a21c>] __und_svc+0x5c/0x70
softirqs last enabled at (159728): [<c000aaf0>] __do_softirq+0x308/0x4bc
softirqs last disabled at (159697): [<c002cf94>] irq_exit+0x150/0x178
CPU: 0 PID: 1 Comm: init Tainted: G D W 5.5.0-rc5 #1
Hardware name: ARM-Versatile (Device Tree Support)
[<c001e8b0>] (unwind_backtrace) from [<c001a774>] (show_stack+0x10/0x14)
[<c001a774>] (show_stack) from [<c004fc4c>] (___might_sleep+0x1a8/0x2bc)
[<c004fc4c>] (___might_sleep) from [<c00380d0>] (exit_signals+0x28/0x134)
[<c00380d0>] (exit_signals) from [<c0029e70>] (do_exit+0xc8/0xb5c)
[<c0029e70>] (do_exit) from [<c001a918>] (die+0x1a0/0x274)
[<c001a918>] (die) from [<c001abfc>] (do_undefinstr+0xac/0x258)
[<c001abfc>] (do_undefinstr) from [<c000a238>] (__und_svc_finish+0x0/0x48)
Exception stack(0xc783b950 to 0xc783b998)
b940: c7226a80 00000000 00000c00 c7bebe00
b960: 00000000 c783b9e4 00000000 c7226a80 00007000 00000060 c7beb848 c783b9f0
b980: 00000060 c783b9a0 c03c24c8 c03b716c 60000153 ffffffff
[<c000a238>] (__und_svc_finish) from [<c03b716c>] (bio_split+0x10c/0x15c)
[<c03b716c>] (bio_split) from [<c03c24c8>] (__blk_queue_split+0x378/0x628)
[<c03c24c8>] (__blk_queue_split) from [<c03c82cc>] (blk_mq_make_request+0x6c/0x7e4)
[<c03c82cc>] (blk_mq_make_request) from [<c03bc7d0>] (generic_make_request+0xec/0x340)
[<c03bc7d0>] (generic_make_request) from [<c03bca70>] (submit_bio+0x4c/0x170)
[<c03bca70>] (submit_bio) from [<c020666c>] (ext4_mpage_readpages+0x54c/0x8e0)
[<c020666c>] (ext4_mpage_readpages) from [<c01e1ec8>] (ext4_readpages+0x40/0x50)
[<c01e1ec8>] (ext4_readpages) from [<c00df808>] (read_pages+0x50/0x13c)
[<c00df808>] (read_pages) from [<c00dfd2c>] (__do_page_cache_readahead+0x1a8/0x1f0)
[<c00dfd2c>] (__do_page_cache_readahead) from [<c00d500c>] (filemap_fault+0x440/0x8f4)
[<c00d500c>] (filemap_fault) from [<c01ed494>] (ext4_filemap_fault+0x28/0x3c)
[<c01ed494>] (ext4_filemap_fault) from [<c0100b88>] (__do_fault+0x3c/0x1c0)
[<c0100b88>] (__do_fault) from [<c0105360>] (handle_mm_fault+0x284/0xaf4)
[<c0105360>] (handle_mm_fault) from [<c001f01c>] (do_page_fault+0x114/0x2e0)
[<c001f01c>] (do_page_fault) from [<c001f34c>] (do_DataAbort+0x38/0xbc)
[<c001f34c>] (do_DataAbort) from [<c000a0dc>] (__dabt_svc+0x5c/0xa0)
Exception stack(0xc783be28 to 0xc783be70)
be20: b6f81898 00000760 00000000 00000055 00000051 c0af3068
be40: c71f7a00 c71f7800 b6f51000 c71df320 c7954c80 00000006 00000000 c783be78
be60: c01a099c c07a4bb4 20000153 ffffffff
[<c000a0dc>] (__dabt_svc) from [<c07a4bb4>] (__clear_user_std+0x34/0x68)
[<c07a4bb4>] (__clear_user_std) from [<c01a099c>] (clear_user+0x40/0x50)
[<c01a099c>] (clear_user) from [<c019f204>] (load_elf_binary+0x1354/0x13c4)
[<c019f204>] (load_elf_binary) from [<c013996c>] (search_binary_handler.part.4+0x58/0x1fc)
[<c013996c>] (search_binary_handler.part.4) from [<c013b18c>] (__do_execve_file+0x780/0x9a4)
[<c013b18c>] (__do_execve_file) from [<c013b54c>] (do_execve+0x28/0x30)
[<c013b54c>] (do_execve) from [<c000af1c>] (try_to_run_init_process+0xc/0x3c)
[<c000af1c>] (try_to_run_init_process) from [<c07c42fc>] (kernel_init+0x88/0xf0)
[<c07c42fc>] (kernel_init) from [<c00090b0>] (ret_from_fork+0x14/0x24)
Exception stack(0xc783bfb0 to 0xc783bff8)
bfa0: 00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

---
bisect log:

# bad: [ae6088216ce4b99b3a4aaaccd2eb2dd40d473d42] Merge tag 'trace-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
# good: [738d2902773e30939a982c8df7a7f94293659810] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect start 'HEAD' '738d2902773e'
# bad: [84029fd04c201a4c7e0b07ba262664900f47c6f5] memcg: account security cred as well to kmemcg
git bisect bad 84029fd04c201a4c7e0b07ba262664900f47c6f5
# good: [e35d0165908ad2d2bdb76773ef77b551763eedbd] Merge tag 'sound-5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good e35d0165908ad2d2bdb76773ef77b551763eedbd
# bad: [3a562aee727a7bfbb3a37b1aa934118397dad701] Merge tag 'for-5.5-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect bad 3a562aee727a7bfbb3a37b1aa934118397dad701
# good: [bed723519a72c0f68fbfaf68ed5bf55d04e46566] Merge tag 'kbuild-fixes-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
git bisect good bed723519a72c0f68fbfaf68ed5bf55d04e46566
# bad: [b6b4aafc99d7c8dbf7d9429bf054b591daab1ad0] Merge tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block
git bisect bad b6b4aafc99d7c8dbf7d9429bf054b591daab1ad0
# bad: [429120f3df2dba2bf3a4a19f4212a53ecefc7102] block: fix splitting segments on boundary masks
git bisect bad 429120f3df2dba2bf3a4a19f4212a53ecefc7102
# good: [85a8ce62c2eabe28b9d76ca4eecf37922402df93] block: add bio_truncate to fix guard_bio_eod
git bisect good 85a8ce62c2eabe28b9d76ca4eecf37922402df93
# first bad commit: [429120f3df2dba2bf3a4a19f4212a53ecefc7102] block: fix splitting segments on boundary masks


2020-01-07 15:25:11

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> Hi,
>
> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > There are two issues in get_max_segment_size():
> >
> > 1) the default segment boudary mask is bypassed, and some devices still
> > require segment to not cross the default 4G boundary
> >
> > 2) the segment start address isn't taken into account when checking
> > segment boundary limit
> >
> > Fixes the two issues.
> >
> > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > Signed-off-by: Ming Lei <[email protected]>
>
> This patch, pushed into mainline as "block: fix splitting segments on
> boundary masks", results in the following crash when booting 'versatilepb'
> in qemu from disk. Bisect log is attached. Detailed log is at
> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
>
> Guenter
>
> ---
> Crash:
>
> kernel BUG at block/bio.c:1885!
> Internal error: Oops - BUG: 0 [#1] ARM

Please apply the following debug patch, and post the log.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 347782a24a35..c4aa683a1c20 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -217,6 +217,33 @@ static bool bvec_split_segs(const struct request_queue *q,
return len > 0 || bv->bv_len > max_len;
}

+static void blk_dump_bio(struct request_queue *q, struct bio *bio,
+ unsigned secs)
+{
+ struct bvec_iter iter;
+ struct bio_vec bvec;
+ int i = 0;
+ unsigned sectors = 0;
+
+ printk("max_sectors %u max_segs %u max_seg_size %u mask %lx\n",
+ get_max_io_size(q, bio), queue_max_segments(q),
+ queue_max_segment_size(q), queue_segment_boundary(q));
+ printk("%p: %hx/%hx %llu %u, %u\n",
+ bio,
+ bio->bi_flags, bio->bi_opf,
+ (unsigned long long)bio->bi_iter.bi_sector,
+ bio->bi_iter.bi_size, secs);
+ bio_for_each_bvec(bvec, bio, iter) {
+ sectors += bvec.bv_len >> 9;
+ trace_printk("\t %d: %lu %u %u(%u)\n", i++,
+ (unsigned long)page_to_pfn(bvec.bv_page),
+ bvec.bv_offset,
+ bvec.bv_len, bvec.bv_len >> 12);
+ }
+ printk("\t total sectors %u\n", sectors);
+}
+
+
/**
* blk_bio_segment_split - split a bio in two bios
* @q: [in] request queue pointer
@@ -273,6 +300,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
return NULL;
split:
*segs = nsegs;
+
+ if (sectors <= 0 || sectors >= bio_sectors(bio))
+ blk_dump_bio(q, bio, sectors);
return bio_split(bio, sectors, GFP_NOIO, bs);
}


Thanks,
Ming

2020-01-07 18:14:19

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> > Hi,
> >
> > On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > > There are two issues in get_max_segment_size():
> > >
> > > 1) the default segment boudary mask is bypassed, and some devices still
> > > require segment to not cross the default 4G boundary
> > >
> > > 2) the segment start address isn't taken into account when checking
> > > segment boundary limit
> > >
> > > Fixes the two issues.
> > >
> > > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > > Signed-off-by: Ming Lei <[email protected]>
> >
> > This patch, pushed into mainline as "block: fix splitting segments on
> > boundary masks", results in the following crash when booting 'versatilepb'
> > in qemu from disk. Bisect log is attached. Detailed log is at
> > https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> >
> > Guenter
> >
> > ---
> > Crash:
> >
> > kernel BUG at block/bio.c:1885!
> > Internal error: Oops - BUG: 0 [#1] ARM
>
> Please apply the following debug patch, and post the log.
>

Here you are:

max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
c738da80: 8c80/0 2416 28672, 0
total sectors 56

(I replaced %p with %px).

Guenter

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 347782a24a35..c4aa683a1c20 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -217,6 +217,33 @@ static bool bvec_split_segs(const struct request_queue *q,
> return len > 0 || bv->bv_len > max_len;
> }
>
> +static void blk_dump_bio(struct request_queue *q, struct bio *bio,
> + unsigned secs)
> +{
> + struct bvec_iter iter;
> + struct bio_vec bvec;
> + int i = 0;
> + unsigned sectors = 0;
> +
> + printk("max_sectors %u max_segs %u max_seg_size %u mask %lx\n",
> + get_max_io_size(q, bio), queue_max_segments(q),
> + queue_max_segment_size(q), queue_segment_boundary(q));
> + printk("%p: %hx/%hx %llu %u, %u\n",
> + bio,
> + bio->bi_flags, bio->bi_opf,
> + (unsigned long long)bio->bi_iter.bi_sector,
> + bio->bi_iter.bi_size, secs);
> + bio_for_each_bvec(bvec, bio, iter) {
> + sectors += bvec.bv_len >> 9;
> + trace_printk("\t %d: %lu %u %u(%u)\n", i++,
> + (unsigned long)page_to_pfn(bvec.bv_page),
> + bvec.bv_offset,
> + bvec.bv_len, bvec.bv_len >> 12);
> + }
> + printk("\t total sectors %u\n", sectors);
> +}
> +
> +
> /**
> * blk_bio_segment_split - split a bio in two bios
> * @q: [in] request queue pointer
> @@ -273,6 +300,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
> return NULL;
> split:
> *segs = nsegs;
> +
> + if (sectors <= 0 || sectors >= bio_sectors(bio))
> + blk_dump_bio(q, bio, sectors);
> return bio_split(bio, sectors, GFP_NOIO, bs);
> }
>
>
> Thanks,
> Ming
>

2020-01-07 22:34:01

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> > On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> > > Hi,
> > >
> > > On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > > > There are two issues in get_max_segment_size():
> > > >
> > > > 1) the default segment boudary mask is bypassed, and some devices still
> > > > require segment to not cross the default 4G boundary
> > > >
> > > > 2) the segment start address isn't taken into account when checking
> > > > segment boundary limit
> > > >
> > > > Fixes the two issues.
> > > >
> > > > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > > > Signed-off-by: Ming Lei <[email protected]>
> > >
> > > This patch, pushed into mainline as "block: fix splitting segments on
> > > boundary masks", results in the following crash when booting 'versatilepb'
> > > in qemu from disk. Bisect log is attached. Detailed log is at
> > > https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> > >
> > > Guenter
> > >
> > > ---
> > > Crash:
> > >
> > > kernel BUG at block/bio.c:1885!
> > > Internal error: Oops - BUG: 0 [#1] ARM
> >
> > Please apply the following debug patch, and post the log.
> >
>
> Here you are:
>
> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
> c738da80: 8c80/0 2416 28672, 0
> total sectors 56
>
> (I replaced %p with %px).
>

Please try the following patch and see if it makes a difference.
If not, replace trace_printk with printk in previous debug patch,
and apply the debug patch only & post the log.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 347782a24a35..f152bdee9b05 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -159,12 +159,12 @@ static inline unsigned get_max_io_size(struct request_queue *q,

static inline unsigned get_max_segment_size(const struct request_queue *q,
struct page *start_page,
- unsigned long offset)
+ unsigned long long offset)
{
unsigned long mask = queue_segment_boundary(q);

offset = mask & (page_to_phys(start_page) + offset);
- return min_t(unsigned long, mask - offset + 1,
+ return min_t(unsigned long long, mask - offset + 1,
queue_max_segment_size(q));
}


Thanks,
Ming

2020-01-07 22:34:35

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On 1/7/20 3:30 PM, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
>> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
>>> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
>>>> Hi,
>>>>
>>>> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
>>>>> There are two issues in get_max_segment_size():
>>>>>
>>>>> 1) the default segment boudary mask is bypassed, and some devices still
>>>>> require segment to not cross the default 4G boundary
>>>>>
>>>>> 2) the segment start address isn't taken into account when checking
>>>>> segment boundary limit
>>>>>
>>>>> Fixes the two issues.
>>>>>
>>>>> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
>>>>> Signed-off-by: Ming Lei <[email protected]>
>>>>
>>>> This patch, pushed into mainline as "block: fix splitting segments on
>>>> boundary masks", results in the following crash when booting 'versatilepb'
>>>> in qemu from disk. Bisect log is attached. Detailed log is at
>>>> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
>>>>
>>>> Guenter
>>>>
>>>> ---
>>>> Crash:
>>>>
>>>> kernel BUG at block/bio.c:1885!
>>>> Internal error: Oops - BUG: 0 [#1] ARM
>>>
>>> Please apply the following debug patch, and post the log.
>>>
>>
>> Here you are:
>>
>> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
>> c738da80: 8c80/0 2416 28672, 0
>> total sectors 56
>>
>> (I replaced %p with %px).
>>
>
> Please try the following patch and see if it makes a difference.
> If not, replace trace_printk with printk in previous debug patch,
> and apply the debug patch only & post the log.

If it is a 32-bit issue, then we should use a 64-bit type to make
this nicer than ULL. But it seems reasonable that it could be!

--
Jens Axboe

2020-01-07 23:07:43

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On Wed, Jan 08, 2020 at 06:30:35AM +0800, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
> > On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> > > On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> > > > Hi,
> > > >
> > > > On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > > > > There are two issues in get_max_segment_size():
> > > > >
> > > > > 1) the default segment boudary mask is bypassed, and some devices still
> > > > > require segment to not cross the default 4G boundary
> > > > >
> > > > > 2) the segment start address isn't taken into account when checking
> > > > > segment boundary limit
> > > > >
> > > > > Fixes the two issues.
> > > > >
> > > > > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > > > > Signed-off-by: Ming Lei <[email protected]>
> > > >
> > > > This patch, pushed into mainline as "block: fix splitting segments on
> > > > boundary masks", results in the following crash when booting 'versatilepb'
> > > > in qemu from disk. Bisect log is attached. Detailed log is at
> > > > https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> > > >
> > > > Guenter
> > > >
> > > > ---
> > > > Crash:
> > > >
> > > > kernel BUG at block/bio.c:1885!
> > > > Internal error: Oops - BUG: 0 [#1] ARM
> > >
> > > Please apply the following debug patch, and post the log.
> > >
> >
> > Here you are:
> >
> > max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
> > c738da80: 8c80/0 2416 28672, 0
> > total sectors 56
> >
> > (I replaced %p with %px).
> >
>
> Please try the following patch and see if it makes a difference.
> If not, replace trace_printk with printk in previous debug patch,
> and apply the debug patch only & post the log.
>

The patch below fixes the problem for me.

Tested-by: Guenter Roeck <[email protected]>

Guenter

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 347782a24a35..f152bdee9b05 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -159,12 +159,12 @@ static inline unsigned get_max_io_size(struct request_queue *q,
>
> static inline unsigned get_max_segment_size(const struct request_queue *q,
> struct page *start_page,
> - unsigned long offset)
> + unsigned long long offset)
> {
> unsigned long mask = queue_segment_boundary(q);
>
> offset = mask & (page_to_phys(start_page) + offset);
> - return min_t(unsigned long, mask - offset + 1,
> + return min_t(unsigned long long, mask - offset + 1,
> queue_max_segment_size(q));
> }
>
>
> Thanks,
> Ming
>

2020-01-08 02:00:31

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On Tue, Jan 07, 2020 at 03:32:58PM -0700, Jens Axboe wrote:
> On 1/7/20 3:30 PM, Ming Lei wrote:
> > On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
> >> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> >>> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> >>>> Hi,
> >>>>
> >>>> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> >>>>> There are two issues in get_max_segment_size():
> >>>>>
> >>>>> 1) the default segment boudary mask is bypassed, and some devices still
> >>>>> require segment to not cross the default 4G boundary
> >>>>>
> >>>>> 2) the segment start address isn't taken into account when checking
> >>>>> segment boundary limit
> >>>>>
> >>>>> Fixes the two issues.
> >>>>>
> >>>>> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> >>>>> Signed-off-by: Ming Lei <[email protected]>
> >>>>
> >>>> This patch, pushed into mainline as "block: fix splitting segments on
> >>>> boundary masks", results in the following crash when booting 'versatilepb'
> >>>> in qemu from disk. Bisect log is attached. Detailed log is at
> >>>> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> >>>>
> >>>> Guenter
> >>>>
> >>>> ---
> >>>> Crash:
> >>>>
> >>>> kernel BUG at block/bio.c:1885!
> >>>> Internal error: Oops - BUG: 0 [#1] ARM
> >>>
> >>> Please apply the following debug patch, and post the log.
> >>>
> >>
> >> Here you are:
> >>
> >> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
> >> c738da80: 8c80/0 2416 28672, 0
> >> total sectors 56
> >>
> >> (I replaced %p with %px).
> >>
> >
> > Please try the following patch and see if it makes a difference.
> > If not, replace trace_printk with printk in previous debug patch,
> > and apply the debug patch only & post the log.
>
> If it is a 32-bit issue, then we should use a 64-bit type to make
> this nicer than ULL. But it seems reasonable that it could be!

oops, just saw this email after sending out the patch.

Do you need V2 to change ULL to u64?

Thanks,
Ming

2020-01-08 02:37:12

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] block: fix splitting segments

On 1/7/20 6:59 PM, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 03:32:58PM -0700, Jens Axboe wrote:
>> On 1/7/20 3:30 PM, Ming Lei wrote:
>>> On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
>>>> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
>>>>> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
>>>>>>> There are two issues in get_max_segment_size():
>>>>>>>
>>>>>>> 1) the default segment boudary mask is bypassed, and some devices still
>>>>>>> require segment to not cross the default 4G boundary
>>>>>>>
>>>>>>> 2) the segment start address isn't taken into account when checking
>>>>>>> segment boundary limit
>>>>>>>
>>>>>>> Fixes the two issues.
>>>>>>>
>>>>>>> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
>>>>>>> Signed-off-by: Ming Lei <[email protected]>
>>>>>>
>>>>>> This patch, pushed into mainline as "block: fix splitting segments on
>>>>>> boundary masks", results in the following crash when booting 'versatilepb'
>>>>>> in qemu from disk. Bisect log is attached. Detailed log is at
>>>>>> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
>>>>>>
>>>>>> Guenter
>>>>>>
>>>>>> ---
>>>>>> Crash:
>>>>>>
>>>>>> kernel BUG at block/bio.c:1885!
>>>>>> Internal error: Oops - BUG: 0 [#1] ARM
>>>>>
>>>>> Please apply the following debug patch, and post the log.
>>>>>
>>>>
>>>> Here you are:
>>>>
>>>> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
>>>> c738da80: 8c80/0 2416 28672, 0
>>>> total sectors 56
>>>>
>>>> (I replaced %p with %px).
>>>>
>>>
>>> Please try the following patch and see if it makes a difference.
>>> If not, replace trace_printk with printk in previous debug patch,
>>> and apply the debug patch only & post the log.
>>
>> If it is a 32-bit issue, then we should use a 64-bit type to make
>> this nicer than ULL. But it seems reasonable that it could be!
>
> oops, just saw this email after sending out the patch.
>
> Do you need V2 to change ULL to u64?

Nah, I can just edit it, that's fine.

--
Jens Axboe