2019-11-26 16:20:17

by SeongJae Park

[permalink] [raw]
Subject: [PATCH] xen/blkback: Avoid unmapping unmapped grant pages

From: SeongJae Park <[email protected]>

For each I/O request, blkback first maps the foreign pages for the
request to its local pages. If an allocation of a local page for the
mapping fails, it should unmap every mapping already made for the
request.

However, blkback's handling mechanism for the allocation failure does
not mark the remaining foreign pages as unmapped. Therefore, the unmap
function merely tries to unmap every valid grant page for the request,
including the pages not mapped due to the allocation failure. On a
system that fails the allocation frequently, this problem leads to
following kernel crash.

[ 372.012538] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
[ 372.012546] IP: [<ffffffff814071ac>] gnttab_unmap_refs.part.7+0x1c/0x40
[ 372.012557] PGD 16f3e9067 PUD 16426e067 PMD 0
[ 372.012562] Oops: 0002 [#1] SMP
[ 372.012566] Modules linked in: act_police sch_ingress cls_u32
...
[ 372.012746] Call Trace:
[ 372.012752] [<ffffffff81407204>] gnttab_unmap_refs+0x34/0x40
[ 372.012759] [<ffffffffa0335ae3>] xen_blkbk_unmap+0x83/0x150 [xen_blkback]
...
[ 372.012802] [<ffffffffa0336c50>] dispatch_rw_block_io+0x970/0x980 [xen_blkback]
...
Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[ 0.000000] Initializing cgroup subsys cpuset

This commit fixes this problem by marking the grant pages of the given
request that didn't mapped due to the allocation failure as invalid.

Fixes: c6cc142dac52 ("xen-blkback: use balloon pages for all mappings")

Signed-off-by: SeongJae Park <[email protected]>
Reviewed-by: David Woodhouse <[email protected]>
Reviewed-by: Maximilian Heyne <[email protected]>
Reviewed-by: Paul Durrant <[email protected]>
---
drivers/block/xen-blkback/blkback.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index fd1e19f1a49f..3666afa639d1 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -936,6 +936,8 @@ static int xen_blkbk_map(struct xen_blkif_ring *ring,
out_of_memory:
pr_alert("%s: out of memory\n", __func__);
put_free_pages(ring, pages_to_gnt, segs_to_map);
+ for (i = last_map; i < num; i++)
+ pages[i]->handle = BLKBACK_INVALID_HANDLE;
return -ENOMEM;
}

--
2.17.1


2019-11-27 09:24:38

by Roger Pau Monne

[permalink] [raw]
Subject: Re: [PATCH] xen/blkback: Avoid unmapping unmapped grant pages

On Tue, Nov 26, 2019 at 04:36:05PM +0100, SeongJae Park wrote:
> From: SeongJae Park <[email protected]>
>
> For each I/O request, blkback first maps the foreign pages for the
> request to its local pages. If an allocation of a local page for the
> mapping fails, it should unmap every mapping already made for the
> request.
>
> However, blkback's handling mechanism for the allocation failure does
> not mark the remaining foreign pages as unmapped. Therefore, the unmap
> function merely tries to unmap every valid grant page for the request,
> including the pages not mapped due to the allocation failure. On a
> system that fails the allocation frequently, this problem leads to
> following kernel crash.
>
> [ 372.012538] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
> [ 372.012546] IP: [<ffffffff814071ac>] gnttab_unmap_refs.part.7+0x1c/0x40
> [ 372.012557] PGD 16f3e9067 PUD 16426e067 PMD 0
> [ 372.012562] Oops: 0002 [#1] SMP
> [ 372.012566] Modules linked in: act_police sch_ingress cls_u32
> ...
> [ 372.012746] Call Trace:
> [ 372.012752] [<ffffffff81407204>] gnttab_unmap_refs+0x34/0x40
> [ 372.012759] [<ffffffffa0335ae3>] xen_blkbk_unmap+0x83/0x150 [xen_blkback]
> ...
> [ 372.012802] [<ffffffffa0336c50>] dispatch_rw_block_io+0x970/0x980 [xen_blkback]
> ...
> Decompressing Linux... Parsing ELF... done.
> Booting the kernel.
> [ 0.000000] Initializing cgroup subsys cpuset
>
> This commit fixes this problem by marking the grant pages of the given
> request that didn't mapped due to the allocation failure as invalid.
>
> Fixes: c6cc142dac52 ("xen-blkback: use balloon pages for all mappings")
>
> Signed-off-by: SeongJae Park <[email protected]>
> Reviewed-by: David Woodhouse <[email protected]>
> Reviewed-by: Maximilian Heyne <[email protected]>
> Reviewed-by: Paul Durrant <[email protected]>

Reviewed-by: Roger Pau Monn? <[email protected]>

Thanks, Roger.

2019-12-03 14:34:49

by SeongJae Park

[permalink] [raw]
Subject: Re: [PATCH] xen/blkback: Avoid unmapping unmapped grant pages

On 27.11.19 10:13, Roger Pau Monné wrote:
> On Tue, Nov 26, 2019 at 04:36:05PM +0100, SeongJae Park wrote:
>> From: SeongJae Park <[email protected]>
>>
>> For each I/O request, blkback first maps the foreign pages for the
>> request to its local pages. If an allocation of a local page for the
>> mapping fails, it should unmap every mapping already made for the
>> request.
>>
>> However, blkback's handling mechanism for the allocation failure does
>> not mark the remaining foreign pages as unmapped. Therefore, the unmap
>> function merely tries to unmap every valid grant page for the request,
>> including the pages not mapped due to the allocation failure. On a
>> system that fails the allocation frequently, this problem leads to
>> following kernel crash.
>>
>> [ 372.012538] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
>> [ 372.012546] IP: [<ffffffff814071ac>] gnttab_unmap_refs.part.7+0x1c/0x40
>> [ 372.012557] PGD 16f3e9067 PUD 16426e067 PMD 0
>> [ 372.012562] Oops: 0002 [#1] SMP
>> [ 372.012566] Modules linked in: act_police sch_ingress cls_u32
>> ...
>> [ 372.012746] Call Trace:
>> [ 372.012752] [<ffffffff81407204>] gnttab_unmap_refs+0x34/0x40
>> [ 372.012759] [<ffffffffa0335ae3>] xen_blkbk_unmap+0x83/0x150 [xen_blkback]
>> ...
>> [ 372.012802] [<ffffffffa0336c50>] dispatch_rw_block_io+0x970/0x980 [xen_blkback]
>> ...
>> Decompressing Linux... Parsing ELF... done.
>> Booting the kernel.
>> [ 0.000000] Initializing cgroup subsys cpuset
>>
>> This commit fixes this problem by marking the grant pages of the given
>> request that didn't mapped due to the allocation failure as invalid.
>>
>> Fixes: c6cc142dac52 ("xen-blkback: use balloon pages for all mappings")
>>
>> Signed-off-by: SeongJae Park <[email protected]>
>> Reviewed-by: David Woodhouse <[email protected]>
>> Reviewed-by: Maximilian Heyne <[email protected]>
>> Reviewed-by: Paul Durrant <[email protected]>
> Reviewed-by: Roger Pau MonnĂ© <[email protected]>
>
> Thanks, Roger.


May I ask some more comments?



Thanks,

SeongJae Park


2019-12-03 21:04:44

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] xen/blkback: Avoid unmapping unmapped grant pages

On 11/26/19 8:36 AM, SeongJae Park wrote:
> From: SeongJae Park <[email protected]>
>
> For each I/O request, blkback first maps the foreign pages for the
> request to its local pages. If an allocation of a local page for the
> mapping fails, it should unmap every mapping already made for the
> request.
>
> However, blkback's handling mechanism for the allocation failure does
> not mark the remaining foreign pages as unmapped. Therefore, the unmap
> function merely tries to unmap every valid grant page for the request,
> including the pages not mapped due to the allocation failure. On a
> system that fails the allocation frequently, this problem leads to
> following kernel crash.
>
> [ 372.012538] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
> [ 372.012546] IP: [<ffffffff814071ac>] gnttab_unmap_refs.part.7+0x1c/0x40
> [ 372.012557] PGD 16f3e9067 PUD 16426e067 PMD 0
> [ 372.012562] Oops: 0002 [#1] SMP
> [ 372.012566] Modules linked in: act_police sch_ingress cls_u32
> ...
> [ 372.012746] Call Trace:
> [ 372.012752] [<ffffffff81407204>] gnttab_unmap_refs+0x34/0x40
> [ 372.012759] [<ffffffffa0335ae3>] xen_blkbk_unmap+0x83/0x150 [xen_blkback]
> ...
> [ 372.012802] [<ffffffffa0336c50>] dispatch_rw_block_io+0x970/0x980 [xen_blkback]
> ...
> Decompressing Linux... Parsing ELF... done.
> Booting the kernel.
> [ 0.000000] Initializing cgroup subsys cpuset
>
> This commit fixes this problem by marking the grant pages of the given
> request that didn't mapped due to the allocation failure as invalid.
>
> Fixes: c6cc142dac52 ("xen-blkback: use balloon pages for all mappings")

Queued up with Roger's reviewed-by.

--
Jens Axboe