Parity page is incorrectly unmapped in finish_parity_scrub(), triggering
a reference counter bug on i386, i.e.:

 [ 157.662401] kernel BUG at mm/highmem.c:349!
 [ 157.666725] invalid opcode: 0000 [#1] SMP PTI

Steps to reproduce the bug:

 - create a raid5 btrfs filesystem:
   # mkfs.btrfs -m raid5 -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde

 - mount it:
   # mount /dev/sdb /mnt

 - run btrfs scrub in a loop:
   # while :; do btrfs scrub start -BR /mnt; done

BugLink: https://bugs.launchpad.net/bugs/1812845
Signed-off-by: Andrea Righi <[email protected]>
---
fs/btrfs/raid56.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 1869ba8e5981..67a6f7d47402 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2430,8 +2430,9 @@ static noinline void finish_parity_scrub(struct btrfs_raid_bio *rbio,
 		bitmap_clear(rbio->dbitmap, pagenr, 1);
 		kunmap(p);
 
-		for (stripe = 0; stripe < rbio->real_stripes; stripe++)
+		for (stripe = 0; stripe < nr_data; stripe++)
 			kunmap(page_in_rbio(rbio, stripe, pagenr, 0));
+		kunmap(p_page);
 	}
 
 	__free_page(p_page);
--
2.19.1
On 13/03/2019 11:17, Andrea Righi wrote:
> Parity page is incorrectly unmapped in finish_parity_scrub(), triggering
> a reference counter bug on i386, i.e.:
>
> [ 157.662401] kernel BUG at mm/highmem.c:349!
> [ 157.666725] invalid opcode: 0000 [#1] SMP PTI
>
> Steps to reproduce the bug:
> - create a raid5 btrfs filesystem:
> # mkfs.btrfs -m raid5 -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>
> - mount it:
> # mount /dev/sdb /mnt
>
> - run btrfs scrub in a loop:
> # while :; do btrfs scrub start -BR /mnt; done
>
> BugLink: https://bugs.launchpad.net/bugs/1812845
> Signed-off-by: Andrea Righi <[email protected]>
> ---
> fs/btrfs/raid56.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
> index 1869ba8e5981..67a6f7d47402 100644
> --- a/fs/btrfs/raid56.c
> +++ b/fs/btrfs/raid56.c
> @@ -2430,8 +2430,9 @@ static noinline void finish_parity_scrub(struct btrfs_raid_bio *rbio,
>  		bitmap_clear(rbio->dbitmap, pagenr, 1);
>  		kunmap(p);
>
> -		for (stripe = 0; stripe < rbio->real_stripes; stripe++)
> +		for (stripe = 0; stripe < nr_data; stripe++)
>  			kunmap(page_in_rbio(rbio, stripe, pagenr, 0));
> +		kunmap(p_page);
>  	}
>
>  	__free_page(p_page);
>
Code-wise this looks OK, but the changelog should really describe what
you're changing and why it is correct.

I.e. the kunmap(p_page) was completely left out, so we never unmapped
p_page, and the loop unmapping the rbio pages iterated over the wrong
number of stripes: the mapping is done with nr_data, but the unmapping
used rbio->real_stripes.

With the above (roughly) placed in the changelog:
Reviewed-by: Johannes Thumshirn <[email protected]>
--
Johannes Thumshirn SUSE Labs Filesystems
Parity page is incorrectly unmapped in finish_parity_scrub(), triggering
a reference counter bug on i386, i.e.:

 [ 157.662401] kernel BUG at mm/highmem.c:349!
 [ 157.666725] invalid opcode: 0000 [#1] SMP PTI

The reason is that kunmap(p_page) was completely left out, so we never
did an unmap for the p_page and the loop unmapping the rbio page was
iterating over the wrong number of stripes: unmapping should be done
with nr_data instead of rbio->real_stripes.

Test case to reproduce the bug:

 - create a raid5 btrfs filesystem:
   # mkfs.btrfs -m raid5 -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde

 - mount it:
   # mount /dev/sdb /mnt

 - run btrfs scrub in a loop:
   # while :; do btrfs scrub start -BR /mnt; done

BugLink: https://bugs.launchpad.net/bugs/1812845
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Andrea Righi <[email protected]>
---
Changes in v2:
 - added a better description of this fix (thanks to Johannes)

fs/btrfs/raid56.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 1869ba8e5981..67a6f7d47402 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -2430,8 +2430,9 @@ static noinline void finish_parity_scrub(struct btrfs_raid_bio *rbio,
 		bitmap_clear(rbio->dbitmap, pagenr, 1);
 		kunmap(p);
 
-		for (stripe = 0; stripe < rbio->real_stripes; stripe++)
+		for (stripe = 0; stripe < nr_data; stripe++)
 			kunmap(page_in_rbio(rbio, stripe, pagenr, 0));
+		kunmap(p_page);
 	}
 
 	__free_page(p_page);
--
2.19.1
On Thu, Mar 14, 2019 at 08:56:28AM +0100, Andrea Righi wrote:
> Parity page is incorrectly unmapped in finish_parity_scrub(), triggering
> a reference counter bug on i386, i.e.:
>
> [ 157.662401] kernel BUG at mm/highmem.c:349!
> [ 157.666725] invalid opcode: 0000 [#1] SMP PTI
>
> The reason is that kunmap(p_page) was completely left out, so we never
> did an unmap for the p_page and the loop unmapping the rbio page was
> iterating over the wrong number of stripes: unmapping should be done
> with nr_data instead of rbio->real_stripes.
>
> Test case to reproduce the bug:
>
> - create a raid5 btrfs filesystem:
> # mkfs.btrfs -m raid5 -d raid5 /dev/sdb /dev/sdc /dev/sdd /dev/sde
>
> - mount it:
> # mount /dev/sdb /mnt
>
> - run btrfs scrub in a loop:
> # while :; do btrfs scrub start -BR /mnt; done
>
> BugLink: https://bugs.launchpad.net/bugs/1812845
> Reviewed-by: Johannes Thumshirn <[email protected]>
> Signed-off-by: Andrea Righi <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Thanks.