2024-04-24 10:27:02

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] block: allow device to have both virt_boundary_mask and max segment size

Hi Ming,

On Sun, 7 Apr 2024, Ming Lei wrote:
> When one stacking device is over one device with virt_boundary_mask and
> another one with max segment size, the stacking device have both limits
> set. This way is allowed before d690cb8ae14b ("block: add an API to
> atomically update queue limits").
>
> Relax the limit so that we won't break such kind of stacking setting.
>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218687
> Reported-by: [email protected]
> Fixes: d690cb8ae14b ("block: add an API to atomically update queue limits")
> Link: https://lore.kernel.org/linux-block/ZfGl8HzUpiOxCLm3@fedora/
> Cc: Christoph Hellwig <[email protected]>
> Cc: Mike Snitzer <[email protected]>
> Cc: [email protected]
> Cc: Song Liu <[email protected]>
> Cc: [email protected]
> Signed-off-by: Ming Lei <[email protected]>

Thanks for your patch, which is now commit b561ea56a26415bf ("block:
allow device to have both virt_boundary_mask and max segment size") in
v6.9-rc4.

With CONFIG_DMA_API_DEBUG_SG=y and IOMMU support enabled, this causes a
warning on R-Car Gen3/Gen4 platforms:

DMA-API: renesas_sdhi_internal_dmac ee160000.mmc: mapping sg segment longer than device claims to support [len=86016] [max=65536]
WARNING: CPU: 1 PID: 281 at kernel/dma/debug.c:1178 debug_dma_map_sg+0x2ac/0x330
Modules linked in:
CPU: 1 PID: 281 Comm: systemd-udevd Tainted: G W 6.9.0-rc2-ebisu-00012-gb561ea56a264 #596
Hardware name: Renesas Ebisu board based on r8a77990 (DT)
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : debug_dma_map_sg+0x2ac/0x330
lr : debug_dma_map_sg+0x2ac/0x330
sp : ffffffc083643470
x29: ffffffc083643470 x28: 0000000000000080 x27: 0000000000010000
x26: 0000000000000000 x25: 0000000000000001 x24: ffffffc0810afc30
x23: ffffffffffffffff x22: ffffffc080c8366f x21: ffffff8008849f80
x20: ffffff800cd24000 x19: ffffff80099a2810 x18: 0000000000000000
x17: ffffff800801a000 x16: ffffffc080453f00 x15: ffffffc0836430f0
x14: ffffffc08099fb50 x13: 0000000000000007 x12: 0000000000000000
x11: 0000000000000202 x10: ffffffc0810d99d0 x9 : ffffffc081189bb0
x8 : ffffffc083643178 x7 : ffffffc083643180 x6 : 00000000ffffdfff
x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff800ebad280
Call trace:
debug_dma_map_sg+0x2ac/0x330
__dma_map_sg_attrs+0xcc/0xd0
dma_map_sg_attrs+0xc/0x1c
renesas_sdhi_internal_dmac_map+0x64/0x94
renesas_sdhi_internal_dmac_pre_req+0x20/0x2c
mmc_blk_mq_issue_rq+0x62c/0x6c8
mmc_mq_queue_rq+0x194/0x218
blk_mq_dispatch_rq_list+0x36c/0x4d4
__blk_mq_sched_dispatch_requests+0x344/0x4e0
blk_mq_sched_dispatch_requests+0x28/0x5c
blk_mq_run_hw_queue+0x1a4/0x218
blk_mq_flush_plug_list+0x2fc/0x4a0
__blk_flush_plug+0x70/0x134
blk_finish_plug+0x24/0x34
read_pages+0x60/0x158
page_cache_ra_unbounded+0x98/0x184
do_page_cache_ra+0x44/0x50
force_page_cache_ra+0x98/0x9c
page_cache_sync_ra+0x30/0x54
filemap_get_pages+0xfc/0x4f8
filemap_read+0xe8/0x2b8
blkdev_read_iter+0x12c/0x144
vfs_read+0x104/0x150
ksys_read+0x6c/0xd4
__arm64_sys_read+0x14/0x1c
invoke_syscall+0x70/0xf4
el0_svc_common.constprop.0+0xb0/0xcc
do_el0_svc+0x1c/0x24
el0_svc+0x34/0x8c
el0t_64_sync_handler+0x88/0x124
el0t_64_sync+0x150/0x154
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffc08007efa4>] copy_process+0x6ac/0x18d4
softirqs last enabled at (0): [<ffffffc08007efa4>] copy_process+0x6ac/0x18d4
softirqs last disabled at (0): [<0000000000000000>] 0x0

Reverting this commit, or disabling IOMMU support fixes the issue.

> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -182,17 +182,13 @@ static int blk_validate_limits(struct queue_limits *lim)
> return -EINVAL;
>
> /*
> - * Devices that require a virtual boundary do not support scatter/gather
> - * I/O natively, but instead require a descriptor list entry for each
> - * page (which might not be identical to the Linux PAGE_SIZE). Because
> - * of that they are not limited by our notion of "segment size".
> + * Stacking device may have both virtual boundary and max segment
> + * size limit, so allow this setting now, and long-term the two
> + * might need to move out of stacking limits since we have immutable
> + * bvec and lower layer bio splitting is supposed to handle the two
> + * correctly.
> */
> - if (lim->virt_boundary_mask) {
> - if (WARN_ON_ONCE(lim->max_segment_size &&
> - lim->max_segment_size != UINT_MAX))
> - return -EINVAL;
> - lim->max_segment_size = UINT_MAX;
> - } else {
> + if (!lim->virt_boundary_mask) {
> /*
> * The maximum segment size has an odd historic 64k default that
> * drivers probably should override. Just like the I/O size we

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2024-04-24 12:41:47

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: allow device to have both virt_boundary_mask and max segment size

On Wed, Apr 24, 2024 at 12:26:26PM +0200, Geert Uytterhoeven wrote:
> Hi Ming,
>
> On Sun, 7 Apr 2024, Ming Lei wrote:
> > When one stacking device is over one device with virt_boundary_mask and
> > another one with max segment size, the stacking device have both limits
> > set. This way is allowed before d690cb8ae14b ("block: add an API to
> > atomically update queue limits").
> >
> > Relax the limit so that we won't break such kind of stacking setting.
> >
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218687
> > Reported-by: [email protected]
> > Fixes: d690cb8ae14b ("block: add an API to atomically update queue limits")
> > Link: https://lore.kernel.org/linux-block/ZfGl8HzUpiOxCLm3@fedora/
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Mike Snitzer <[email protected]>
> > Cc: [email protected]
> > Cc: Song Liu <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Ming Lei <[email protected]>
>
> Thanks for your patch, which is now commit b561ea56a26415bf ("block:
> allow device to have both virt_boundary_mask and max segment size") in
> v6.9-rc4.
>
> With CONFIG_DMA_API_DEBUG_SG=y and IOMMU support enabled, this causes a
> warning on R-Car Gen3/Gen4 platforms:
>
> DMA-API: renesas_sdhi_internal_dmac ee160000.mmc: mapping sg segment longer than device claims to support [len=86016] [max=65536]
> WARNING: CPU: 1 PID: 281 at kernel/dma/debug.c:1178 debug_dma_map_sg+0x2ac/0x330
> Modules linked in:
> CPU: 1 PID: 281 Comm: systemd-udevd Tainted: G W 6.9.0-rc2-ebisu-00012-gb561ea56a264 #596
> Hardware name: Renesas Ebisu board based on r8a77990 (DT)
> pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : debug_dma_map_sg+0x2ac/0x330
> lr : debug_dma_map_sg+0x2ac/0x330
> sp : ffffffc083643470
> x29: ffffffc083643470 x28: 0000000000000080 x27: 0000000000010000
> x26: 0000000000000000 x25: 0000000000000001 x24: ffffffc0810afc30
> x23: ffffffffffffffff x22: ffffffc080c8366f x21: ffffff8008849f80
> x20: ffffff800cd24000 x19: ffffff80099a2810 x18: 0000000000000000
> x17: ffffff800801a000 x16: ffffffc080453f00 x15: ffffffc0836430f0
> x14: ffffffc08099fb50 x13: 0000000000000007 x12: 0000000000000000
> x11: 0000000000000202 x10: ffffffc0810d99d0 x9 : ffffffc081189bb0
> x8 : ffffffc083643178 x7 : ffffffc083643180 x6 : 00000000ffffdfff
> x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff800ebad280
> Call trace:
> debug_dma_map_sg+0x2ac/0x330
> __dma_map_sg_attrs+0xcc/0xd0
> dma_map_sg_attrs+0xc/0x1c
> renesas_sdhi_internal_dmac_map+0x64/0x94
> renesas_sdhi_internal_dmac_pre_req+0x20/0x2c
> mmc_blk_mq_issue_rq+0x62c/0x6c8
> mmc_mq_queue_rq+0x194/0x218
> blk_mq_dispatch_rq_list+0x36c/0x4d4
> __blk_mq_sched_dispatch_requests+0x344/0x4e0
> blk_mq_sched_dispatch_requests+0x28/0x5c
> blk_mq_run_hw_queue+0x1a4/0x218
> blk_mq_flush_plug_list+0x2fc/0x4a0
> __blk_flush_plug+0x70/0x134
> blk_finish_plug+0x24/0x34
> read_pages+0x60/0x158
> page_cache_ra_unbounded+0x98/0x184
> do_page_cache_ra+0x44/0x50
> force_page_cache_ra+0x98/0x9c
> page_cache_sync_ra+0x30/0x54
> filemap_get_pages+0xfc/0x4f8
> filemap_read+0xe8/0x2b8
> blkdev_read_iter+0x12c/0x144
> vfs_read+0x104/0x150
> ksys_read+0x6c/0xd4
> __arm64_sys_read+0x14/0x1c
> invoke_syscall+0x70/0xf4
> el0_svc_common.constprop.0+0xb0/0xcc
> do_el0_svc+0x1c/0x24
> el0_svc+0x34/0x8c
> el0t_64_sync_handler+0x88/0x124
> el0t_64_sync+0x150/0x154
> irq event stamp: 0
> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> hardirqs last disabled at (0): [<ffffffc08007efa4>] copy_process+0x6ac/0x18d4
> softirqs last enabled at (0): [<ffffffc08007efa4>] copy_process+0x6ac/0x18d4
> softirqs last disabled at (0): [<0000000000000000>] 0x0
>
> Reverting this commit, or disabling IOMMU support fixes the issue.

Hello Geert,

Can you test the following patch?


diff --git a/block/blk-settings.c b/block/blk-settings.c
index 8e1d7ed52fef..ebba05a2bc7f 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -188,7 +188,10 @@ static int blk_validate_limits(struct queue_limits *lim)
* bvec and lower layer bio splitting is supposed to handle the two
* correctly.
*/
- if (!lim->virt_boundary_mask) {
+ if (lim->virt_boundary_mask) {
+ if (!lim->max_segment_size)
+ lim->max_segment_size = UINT_MAX;
+ } else {
/*
* The maximum segment size has an odd historic 64k default that
* drivers probably should override. Just like the I/O size we

thanks
Ming


2024-04-24 13:09:41

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] block: allow device to have both virt_boundary_mask and max segment size

Hi Ming,

On Wed, Apr 24, 2024 at 2:41 PM Ming Lei <[email protected]> wrote:
> On Wed, Apr 24, 2024 at 12:26:26PM +0200, Geert Uytterhoeven wrote:
> > On Sun, 7 Apr 2024, Ming Lei wrote:
> > > When one stacking device is over one device with virt_boundary_mask and
> > > another one with max segment size, the stacking device have both limits
> > > set. This way is allowed before d690cb8ae14b ("block: add an API to
> > > atomically update queue limits").
> > >
> > > Relax the limit so that we won't break such kind of stacking setting.
> > >
> > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218687
> > > Reported-by: [email protected]
> > > Fixes: d690cb8ae14b ("block: add an API to atomically update queue limits")
> > > Link: https://lore.kernel.org/linux-block/ZfGl8HzUpiOxCLm3@fedora/
> > > Cc: Christoph Hellwig <[email protected]>
> > > Cc: Mike Snitzer <[email protected]>
> > > Cc: [email protected]
> > > Cc: Song Liu <[email protected]>
> > > Cc: [email protected]
> > > Signed-off-by: Ming Lei <[email protected]>
> >
> > Thanks for your patch, which is now commit b561ea56a26415bf ("block:
> > allow device to have both virt_boundary_mask and max segment size") in
> > v6.9-rc4.
> >
> > With CONFIG_DMA_API_DEBUG_SG=y and IOMMU support enabled, this causes a
> > warning on R-Car Gen3/Gen4 platforms:
> >
> > DMA-API: renesas_sdhi_internal_dmac ee160000.mmc: mapping sg segment longer than device claims to support [len=86016] [max=65536]
> > WARNING: CPU: 1 PID: 281 at kernel/dma/debug.c:1178 debug_dma_map_sg+0x2ac/0x330
> >
> > Reverting this commit, or disabling IOMMU support fixes the issue.
>
> Can you test the following patch?
>
>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 8e1d7ed52fef..ebba05a2bc7f 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -188,7 +188,10 @@ static int blk_validate_limits(struct queue_limits *lim)
> * bvec and lower layer bio splitting is supposed to handle the two
> * correctly.
> */
> - if (!lim->virt_boundary_mask) {
> + if (lim->virt_boundary_mask) {
> + if (!lim->max_segment_size)
> + lim->max_segment_size = UINT_MAX;
> + } else {
> /*
> * The maximum segment size has an odd historic 64k default that
> * drivers probably should override. Just like the I/O size we

Thanks, that works for me (both with/without IOMMU support)!
Tested-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68korg

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds