btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which
returns a local reference of the blcok group that contains the given
bytenr to "block_group" with increased refcount.
When btrfs_remove_block_group() returns, "block_group" becomes invalid,
so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in several exception handling paths
of btrfs_remove_block_group(). When those error scenarios occur such as
btrfs_alloc_path() returns NULL, the function forgets to decrease its
refcnt increased by btrfs_lookup_block_group() and will cause a refcnt
leak.
Fix this issue by jumping to "out_put_group" label and calling
btrfs_put_block_group() when those error scenarios occur.
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
---
fs/btrfs/block-group.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 404e050ce8ee..d9f432bd3329 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -916,7 +916,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
path = btrfs_alloc_path();
if (!path) {
ret = -ENOMEM;
- goto out;
+ goto out_put_group;
}
/*
@@ -954,7 +954,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
ret = btrfs_orphan_add(trans, BTRFS_I(inode));
if (ret) {
btrfs_add_delayed_iput(inode);
- goto out;
+ goto out_put_group;
}
clear_nlink(inode);
/* One for the block groups ref */
@@ -977,13 +977,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
if (ret < 0)
- goto out;
+ goto out_put_group;
if (ret > 0)
btrfs_release_path(path);
if (ret == 0) {
ret = btrfs_del_item(trans, tree_root, path);
if (ret)
- goto out;
+ goto out_put_group;
btrfs_release_path(path);
}
@@ -1102,7 +1102,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
ret = remove_block_group_free_space(trans, block_group);
if (ret)
- goto out;
+ goto out_put_group;
btrfs_put_block_group(block_group);
btrfs_put_block_group(block_group);
@@ -1132,6 +1132,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
btrfs_delayed_refs_rsv_release(fs_info, 1);
btrfs_free_path(path);
return ret;
+out_put_group:
+ btrfs_put_block_group(block_group);
+ goto out;
}
struct btrfs_trans_handle *btrfs_start_trans_remove_block_group(
--
2.7.4
On Mon, Apr 20, 2020 at 6:48 AM Xiyu Yang <[email protected]> wrote:
>
> btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which
> returns a local reference of the blcok group that contains the given
> bytenr to "block_group" with increased refcount.
>
> When btrfs_remove_block_group() returns, "block_group" becomes invalid,
> so the refcount should be decreased to keep refcount balanced.
>
> The reference counting issue happens in several exception handling paths
> of btrfs_remove_block_group(). When those error scenarios occur such as
> btrfs_alloc_path() returns NULL, the function forgets to decrease its
> refcnt increased by btrfs_lookup_block_group() and will cause a refcnt
> leak.
>
> Fix this issue by jumping to "out_put_group" label and calling
> btrfs_put_block_group() when those error scenarios occur.
>
> Signed-off-by: Xiyu Yang <[email protected]>
> Signed-off-by: Xin Tan <[email protected]>
Seems correct to me.
I would change the subject to something more clear like: "btrfs: fix
block group leak after failure to remove it"
One more suggestion below.
> ---
> fs/btrfs/block-group.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 404e050ce8ee..d9f432bd3329 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -916,7 +916,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> path = btrfs_alloc_path();
> if (!path) {
> ret = -ENOMEM;
> - goto out;
> + goto out_put_group;
> }
>
> /*
> @@ -954,7 +954,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> ret = btrfs_orphan_add(trans, BTRFS_I(inode));
> if (ret) {
> btrfs_add_delayed_iput(inode);
> - goto out;
> + goto out_put_group;
> }
> clear_nlink(inode);
> /* One for the block groups ref */
> @@ -977,13 +977,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
>
> ret = btrfs_search_slot(trans, tree_root, &key, path, -1, 1);
> if (ret < 0)
> - goto out;
> + goto out_put_group;
> if (ret > 0)
> btrfs_release_path(path);
> if (ret == 0) {
> ret = btrfs_del_item(trans, tree_root, path);
> if (ret)
> - goto out;
> + goto out_put_group;
> btrfs_release_path(path);
> }
>
> @@ -1102,7 +1102,7 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
>
> ret = remove_block_group_free_space(trans, block_group);
> if (ret)
> - goto out;
> + goto out_put_group;
>
> btrfs_put_block_group(block_group);
> btrfs_put_block_group(block_group);
> @@ -1132,6 +1132,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> btrfs_delayed_refs_rsv_release(fs_info, 1);
> btrfs_free_path(path);
> return ret;
> +out_put_group:
> + btrfs_put_block_group(block_group);
> + goto out;
Instead of this double goto, which tends to be error prone and harder to follow,
I suggest placing a call to btrfs_put_block_group() right after the
'out' label, with a comment
above it saying something like "once for the lookup reference" and
removing one of the btrfs_put_block_group()
calls right after calling remove_block_group_free_space(), and leaving
a comment above the other one saying "once for the block groups
rbtree".
Thanks.
> }
>
> struct btrfs_trans_handle *btrfs_start_trans_remove_block_group(
> --
> 2.7.4
>
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
On Mon, Apr 20, 2020 at 01:38:40PM +0800, Xiyu Yang wrote:
> btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which
> returns a local reference of the blcok group that contains the given
> bytenr to "block_group" with increased refcount.
>
> When btrfs_remove_block_group() returns, "block_group" becomes invalid,
> so the refcount should be decreased to keep refcount balanced.
>
> The reference counting issue happens in several exception handling paths
> of btrfs_remove_block_group(). When those error scenarios occur such as
> btrfs_alloc_path() returns NULL, the function forgets to decrease its
> refcnt increased by btrfs_lookup_block_group() and will cause a refcnt
> leak.
>
> Fix this issue by jumping to "out_put_group" label and calling
> btrfs_put_block_group() when those error scenarios occur.
>
> Signed-off-by: Xiyu Yang <[email protected]>
> Signed-off-by: Xin Tan <[email protected]>
Thanks for the fix. May I ask if this was found by code inspection or by
some analysis tool?
> @@ -1132,6 +1132,9 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> btrfs_delayed_refs_rsv_release(fs_info, 1);
> btrfs_free_path(path);
> return ret;
> +out_put_group:
> + btrfs_put_block_group(block_group);
> + goto out;
As Filipe noted, the trailing gotos are not a great pattern, what we'd
like to do is a more linear sequence where the resource
allocation/freeing nesting is obvious, like:
x = allocate(X);
if (!x)
goto out;
...
y = allocate(Y);
if (!y)
goto out_free_x;
z = allocate(Z);
if (!z)
goto out_free_y;
...
free(z);
out_free_y:
free(y);
out_free_x:
free(x);
out:
return;
(where allocate/free can be refcount inc/dec and similar). Sometimes
it's not that straightforward and the freeing block needs conditionals,
but from code reading perspective this is still better than potentially
wild gotos.
On Tue, Apr 21, 2020 at 12:43:15AM +0200, David Sterba wrote:
> On Mon, Apr 20, 2020 at 01:38:40PM +0800, Xiyu Yang wrote:
> > btrfs_remove_block_group() invokes btrfs_lookup_block_group(), which
> > returns a local reference of the blcok group that contains the given
> > bytenr to "block_group" with increased refcount.
> >
> > When btrfs_remove_block_group() returns, "block_group" becomes invalid,
> > so the refcount should be decreased to keep refcount balanced.
> >
> > The reference counting issue happens in several exception handling paths
> > of btrfs_remove_block_group(). When those error scenarios occur such as
> > btrfs_alloc_path() returns NULL, the function forgets to decrease its
> > refcnt increased by btrfs_lookup_block_group() and will cause a refcnt
> > leak.
> >
> > Fix this issue by jumping to "out_put_group" label and calling
> > btrfs_put_block_group() when those error scenarios occur.
> >
> > Signed-off-by: Xiyu Yang <[email protected]>
> > Signed-off-by: Xin Tan <[email protected]>
>
> Thanks for the fix. May I ask if this was found by code inspection or by
> some analysis tool?
Thanks for your advice about the patch! We are looking for some automated ways
to find this kind of bug.