2020-04-19 02:10:34

by Qiushi Wu

[permalink] [raw]
Subject: [PATCH] btrfs: fix a potential racy

From: Qiushi Wu <[email protected]>

In function reada_find_extent and reada_extent_put, kref_get(&zone->refcnt)
are not called in a lock context. Potential racy may happen. It's possible
that thread1 decreases the kref to 0, and thread2 increases the kref to 1,
then thread1 releases the pointer.

Signed-off-by: Qiushi Wu <[email protected]>
---
fs/btrfs/reada.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/reada.c b/fs/btrfs/reada.c
index 243a2e44526e..4b90d04f7a0a 100644
--- a/fs/btrfs/reada.c
+++ b/fs/btrfs/reada.c
@@ -361,13 +361,15 @@ static struct reada_extent *reada_find_extent(struct btrfs_fs_info *fs_info,
zone = reada_find_zone(dev, logical, bbio);
if (!zone)
continue;
-
+ spin_lock(&fs_info->reada_lock);
re->zones[re->nzones++] = zone;
spin_lock(&zone->lock);
if (!zone->elems)
kref_get(&zone->refcnt);
+ spin_unlock(&fs_info->reada_lock);
++zone->elems;
spin_unlock(&zone->lock);
+
spin_lock(&fs_info->reada_lock);
kref_put(&zone->refcnt, reada_zone_release);
spin_unlock(&fs_info->reada_lock);
@@ -458,8 +460,11 @@ static struct reada_extent *reada_find_extent(struct btrfs_fs_info *fs_info,
for (nzones = 0; nzones < re->nzones; ++nzones) {
struct reada_zone *zone;

+ spin_lock(&fs_info->reada_lock);
zone = re->zones[nzones];
kref_get(&zone->refcnt);
+ spin_unlock(&fs_info->reada_lock);
+
spin_lock(&zone->lock);
--zone->elems;
if (zone->elems == 0) {
@@ -502,9 +507,11 @@ static void reada_extent_put(struct btrfs_fs_info *fs_info,
spin_unlock(&fs_info->reada_lock);

for (i = 0; i < re->nzones; ++i) {
+ spin_lock(&fs_info->reada_lock);
struct reada_zone *zone = re->zones[i];
-
kref_get(&zone->refcnt);
+ spin_unlock(&fs_info->reada_lock);
+
spin_lock(&zone->lock);
--zone->elems;
if (zone->elems == 0) {
--
2.17.1


2020-04-22 18:09:07

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] btrfs: fix a potential racy

Hi,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on btrfs/next]
[also build test WARNING on v5.7-rc2 next-20200421]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/wu000273-umn-edu/btrfs-fix-a-potential-racy/20200421-072124
base: https://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git next
config: arm64-allmodconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <[email protected]>

All warnings (new ones prefixed by >>):

fs/btrfs/reada.c: In function 'reada_extent_put':
>> fs/btrfs/reada.c:509:3: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
509 | struct reada_zone *zone = re->zones[i];
| ^~~~~~

vim +509 fs/btrfs/reada.c

7414a03fbf9e75 Arne Jansen 2011-05-23 485
7414a03fbf9e75 Arne Jansen 2011-05-23 486 static void reada_extent_put(struct btrfs_fs_info *fs_info,
7414a03fbf9e75 Arne Jansen 2011-05-23 487 struct reada_extent *re)
7414a03fbf9e75 Arne Jansen 2011-05-23 488 {
7414a03fbf9e75 Arne Jansen 2011-05-23 489 int i;
09cbfeaf1a5a67 Kirill A. Shutemov 2016-04-01 490 unsigned long index = re->logical >> PAGE_SHIFT;
7414a03fbf9e75 Arne Jansen 2011-05-23 491
7414a03fbf9e75 Arne Jansen 2011-05-23 492 spin_lock(&fs_info->reada_lock);
99621b44aa194e Al Viro 2012-08-29 493 if (--re->refcnt) {
7414a03fbf9e75 Arne Jansen 2011-05-23 494 spin_unlock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 495 return;
7414a03fbf9e75 Arne Jansen 2011-05-23 496 }
7414a03fbf9e75 Arne Jansen 2011-05-23 497
7414a03fbf9e75 Arne Jansen 2011-05-23 498 radix_tree_delete(&fs_info->reada_tree, index);
7414a03fbf9e75 Arne Jansen 2011-05-23 499 for (i = 0; i < re->nzones; ++i) {
7414a03fbf9e75 Arne Jansen 2011-05-23 500 struct reada_zone *zone = re->zones[i];
7414a03fbf9e75 Arne Jansen 2011-05-23 501
7414a03fbf9e75 Arne Jansen 2011-05-23 502 radix_tree_delete(&zone->device->reada_extents, index);
7414a03fbf9e75 Arne Jansen 2011-05-23 503 }
7414a03fbf9e75 Arne Jansen 2011-05-23 504
7414a03fbf9e75 Arne Jansen 2011-05-23 505 spin_unlock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 506
7414a03fbf9e75 Arne Jansen 2011-05-23 507 for (i = 0; i < re->nzones; ++i) {
1255ee642bf9b0 Qiushi Wu 2020-04-18 508 spin_lock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 @509 struct reada_zone *zone = re->zones[i];
7414a03fbf9e75 Arne Jansen 2011-05-23 510 kref_get(&zone->refcnt);
1255ee642bf9b0 Qiushi Wu 2020-04-18 511 spin_unlock(&fs_info->reada_lock);
1255ee642bf9b0 Qiushi Wu 2020-04-18 512
7414a03fbf9e75 Arne Jansen 2011-05-23 513 spin_lock(&zone->lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 514 --zone->elems;
7414a03fbf9e75 Arne Jansen 2011-05-23 515 if (zone->elems == 0) {
7414a03fbf9e75 Arne Jansen 2011-05-23 516 /* no fs_info->reada_lock needed, as this can't be
7414a03fbf9e75 Arne Jansen 2011-05-23 517 * the last ref */
7414a03fbf9e75 Arne Jansen 2011-05-23 518 kref_put(&zone->refcnt, reada_zone_release);
7414a03fbf9e75 Arne Jansen 2011-05-23 519 }
7414a03fbf9e75 Arne Jansen 2011-05-23 520 spin_unlock(&zone->lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 521
7414a03fbf9e75 Arne Jansen 2011-05-23 522 spin_lock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 523 kref_put(&zone->refcnt, reada_zone_release);
7414a03fbf9e75 Arne Jansen 2011-05-23 524 spin_unlock(&fs_info->reada_lock);
7414a03fbf9e75 Arne Jansen 2011-05-23 525 }
7414a03fbf9e75 Arne Jansen 2011-05-23 526
7414a03fbf9e75 Arne Jansen 2011-05-23 527 kfree(re);
7414a03fbf9e75 Arne Jansen 2011-05-23 528 }
7414a03fbf9e75 Arne Jansen 2011-05-23 529

:::::: The code at line 509 was first introduced by commit
:::::: 7414a03fbf9e75fbbf2a3c16828cd862e572aa44 btrfs: initial readahead code and prototypes

:::::: TO: Arne Jansen <[email protected]>
:::::: CC: Arne Jansen <[email protected]>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (5.02 kB)
.config.gz (56.79 kB)
Download all attachments

2020-04-22 21:47:44

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH] btrfs: fix a potential racy

On Sat, Apr 18, 2020 at 08:59:07PM -0500, [email protected] wrote:
> From: Qiushi Wu <[email protected]>
>
> In function reada_find_extent and reada_extent_put, kref_get(&zone->refcnt)
> are not called in a lock context. Potential racy may happen. It's possible
> that thread1 decreases the kref to 0, and thread2 increases the kref to 1,
> then thread1 releases the pointer.

There are several things I don't see or understand why they woudl be a
problem.

kref_get does not need to be take under locks in case it's not the first
reference or if it's clear that eg. the caller has taken a reference and
it'll never go to 0.

The references are supposed to be lightweight so taking the locks per
kref_get does not follow a good pattern.

The kref type is a wrapper around refcount_t, that detects if there's an
increment from 0->1, similar to what you suggest that could happen. It
might be tricky to actually hit that case so I'm not saying that it's
all ok, just that we haven't seen that so far.

Your description of the race needs to be expanded as it's too terse,
where exactly the increments could happen, how would be the calls in the
two cpus interleaved, like

cpu1 cpu2

reada_find_extent()
kref_get
...
reada_find_extent()
kref_put (last put, refs == 0)
kref_get (inc from zero)
...

Then it's easer to reason about the actual races and the context where
it could happen. Eg. btrfs_reada_add adds one reference from the
beginning, so the final put does not happen in the middle of
reada_find_extent unless something would be seriously broken and that
would manifest very clearly.