2023-12-09 10:22:09

by kernel test robot

[permalink] [raw]
Subject: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
>> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
| ^
include/asm-generic/barrier.h:176:29: note: expanded from macro 'smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^
arch/riscv/include/asm/barrier.h:42:2: note: expanded from macro '__smp_load_acquire'
42 | compiletime_assert_atomic_type(*p); \
| ^
include/linux/compiler_types.h:438:2: note: expanded from macro 'compiletime_assert_atomic_type'
438 | compiletime_assert(__native_word(t), \
| ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
include/linux/compiler_types.h:423:2: note: expanded from macro '_compiletime_assert'
423 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:416:4: note: expanded from macro '__compiletime_assert'
416 | prefix ## suffix(); \
| ^
<scratch space>:60:1: note: expanded from here
60 | __compiletime_assert_329
| ^
fs/btrfs/transaction.c:434:3: error: call to '__compiletime_assert_326' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
434 | smp_store_release(&root->last_trans, trans->transid); /* ^^^ */
| ^
include/asm-generic/barrier.h:172:55: note: expanded from macro 'smp_store_release'
172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
| ^
arch/riscv/include/asm/barrier.h:34:2: note: expanded from macro '__smp_store_release'
34 | compiletime_assert_atomic_type(*p); \
| ^
include/linux/compiler_types.h:438:2: note: expanded from macro 'compiletime_assert_atomic_type'
438 | compiletime_assert(__native_word(t), \
| ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
include/linux/compiler_types.h:423:2: note: expanded from macro '_compiletime_assert'
423 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:416:4: note: expanded from macro '__compiletime_assert'
416 | prefix ## suffix(); \
| ^
<scratch space>:46:1: note: expanded from here
46 | __compiletime_assert_326
| ^
>> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
| ^
include/asm-generic/barrier.h:176:29: note: expanded from macro 'smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^
arch/riscv/include/asm/barrier.h:42:2: note: expanded from macro '__smp_load_acquire'
42 | compiletime_assert_atomic_type(*p); \
| ^
include/linux/compiler_types.h:438:2: note: expanded from macro 'compiletime_assert_atomic_type'
438 | compiletime_assert(__native_word(t), \
| ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
include/linux/compiler_types.h:423:2: note: expanded from macro '_compiletime_assert'
423 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:416:4: note: expanded from macro '__compiletime_assert'
416 | prefix ## suffix(); \
| ^
<scratch space>:60:1: note: expanded from here
60 | __compiletime_assert_329
| ^
3 warnings and 3 errors generated.


vim +496 fs/btrfs/transaction.c

481
482 int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
483 struct btrfs_root *root)
484 {
485 struct btrfs_fs_info *fs_info = root->fs_info;
486 int ret;
487
488 if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
489 return 0;
490
491 /*
492 * see record_root_in_trans for comments about IN_TRANS_SETUP usage
493 * and barriers. The smp_load_acquire() pairs with the smp_wmb()
494 * and the test_bit_acquire pairs with the smp_mb__before_atomic().
495 */
> 496 if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
497 !test_bit_acquire(BTRFS_ROOT_IN_TRANS_SETUP, &root->state)) /* ^^^ */
498 return 0;
499
500 mutex_lock(&fs_info->reloc_mutex);
501 ret = record_root_in_trans(trans, root, 0);
502 mutex_unlock(&fs_info->reloc_mutex);
503
504 return ret;
505 }
506

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


2023-12-09 15:51:47

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <[email protected]>
> | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
>
> All errors (new ones prefixed by >>):
>
> warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> | ^

Ooooh!!! :-/

From what I can see, the current code can tear this load on 32-bit
systems, which can result in bad comparisons and then in failure to wait
for a partially complete transaction.

So is btrfs actually supported on 32-bit systems? If not, would the
following patch be appropriate?

If btrfs is to be supported on 32-bit systems, from what I can see some
major surgery is required, even if a 32-bit counter is wrap-safe for
this particular type of transaction. (But SSDs? In-memory btrfs
filesystems?)

Thanx, Paul

------------------------------------------------------------------------

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 4fb925e8c981..4d56158c34f9 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -19,6 +19,7 @@ config BTRFS_FS
select RAID6_PQ
select XOR_BLOCKS
depends on PAGE_SIZE_LESS_THAN_256KB
+ depends on 64BIT

help
Btrfs is a general purpose copy-on-write filesystem with extents,

2023-12-13 13:01:41

by David Sterba

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Sat, Dec 09, 2023 at 07:51:30AM -0800, Paul E. McKenney wrote:
> On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> > tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> > head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> > commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> > config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <[email protected]>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> >
> > All errors (new ones prefixed by >>):
> >
> > warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> > warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> > warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> > >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> > 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> > | ^
>
> Ooooh!!! :-/
>
> From what I can see, the current code can tear this load on 32-bit
> systems, which can result in bad comparisons and then in failure to wait
> for a partially complete transaction.
>
> So is btrfs actually supported on 32-bit systems? If not, would the
> following patch be appropriate?

There are limitations on 32bit systems, eg. due to shorter inode numbers
(ino_t is unsigned long) and that radix-tree/xarray does support only
unsigned long keys, while we have 64bit identifiers for inodes or tree
roots.

So far we support that and dropping it completely is I think a big deal,
like with any possibly used feature. What I've seen there are NAS boxes
with low power ARM that are 32bit.

> If btrfs is to be supported on 32-bit systems, from what I can see some
> major surgery is required, even if a 32-bit counter is wrap-safe for
> this particular type of transaction. (But SSDs? In-memory btrfs
> filesystems?)

We won't probably do any major surgery to support 32bit systems.

>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> index 4fb925e8c981..4d56158c34f9 100644
> --- a/fs/btrfs/Kconfig
> +++ b/fs/btrfs/Kconfig
> @@ -19,6 +19,7 @@ config BTRFS_FS
> select RAID6_PQ
> select XOR_BLOCKS
> depends on PAGE_SIZE_LESS_THAN_256KB
> + depends on 64BIT

Can we keep the current inefficient smp_* barriers instead of dropping
the whole 32bit support as an alternative. If the smp_load_acquire are
better but not strictly necessary for the correctness (from the barriers
POV) I'd suggest to leave it as-is. We can put comments in case somebody
wants to optimize it in the future again.

2023-12-13 14:56:59

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Wed, Dec 13, 2023 at 01:53:58PM +0100, David Sterba wrote:
> On Sat, Dec 09, 2023 at 07:51:30AM -0800, Paul E. McKenney wrote:
> > On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> > > head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> > > commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> > > config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> > > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <[email protected]>
> > > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > >
> > > All errors (new ones prefixed by >>):
> > >
> > > warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> > > warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> > > warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> > > >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> > > 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> > > | ^
> >
> > Ooooh!!! :-/
> >
> > From what I can see, the current code can tear this load on 32-bit
> > systems, which can result in bad comparisons and then in failure to wait
> > for a partially complete transaction.
> >
> > So is btrfs actually supported on 32-bit systems? If not, would the
> > following patch be appropriate?
>
> There are limitations on 32bit systems, eg. due to shorter inode numbers
> (ino_t is unsigned long) and that radix-tree/xarray does support only
> unsigned long keys, while we have 64bit identifiers for inodes or tree
> roots.
>
> So far we support that and dropping it completely is I think a big deal,
> like with any possibly used feature. What I've seen there are NAS boxes
> with low power ARM that are 32bit.
>
> > If btrfs is to be supported on 32-bit systems, from what I can see some
> > major surgery is required, even if a 32-bit counter is wrap-safe for
> > this particular type of transaction. (But SSDs? In-memory btrfs
> > filesystems?)
>
> We won't probably do any major surgery to support 32bit systems.

Got it, and thank you for the background! My takeaway is that 32-bit
BTRFS must work in the common case, but might have issues on some
workloads, for example, running out of inode numbers or load tearing.

> > ------------------------------------------------------------------------
> >
> > diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> > index 4fb925e8c981..4d56158c34f9 100644
> > --- a/fs/btrfs/Kconfig
> > +++ b/fs/btrfs/Kconfig
> > @@ -19,6 +19,7 @@ config BTRFS_FS
> > select RAID6_PQ
> > select XOR_BLOCKS
> > depends on PAGE_SIZE_LESS_THAN_256KB
> > + depends on 64BIT
>
> Can we keep the current inefficient smp_* barriers instead of dropping
> the whole 32bit support as an alternative. If the smp_load_acquire are
> better but not strictly necessary for the correctness (from the barriers
> POV) I'd suggest to leave it as-is. We can put comments in case somebody
> wants to optimize it in the future again.

We still have the barrier placement issue, given that smp_rmb() enforces
only the ordering of earlier and later loads, correct? Or am I missing
some other ordering constraint that makes all that work?

But I can make each of the current patch's smp_load_acquire() call instead
be be a READ_ONCE() followed by an smp_rmb(), the test_bit_acquire()
call be test_bit() followed by smp_rmb(), and the smp_store_release()
call be an smp_wmb() followed by a WRITE_ONCE(). This provides the needed
ordering, bullet-proofs the 64-bit code against compilers, but works on
32-bit systems. For example, on a 32-bit system the 64-bit READ_ONCE()
and WRITE_ONCE() might still be compiled as a pair of 32-bit memory
accesses, but they will both be guaranteed to be single memory accesses
on 64-bit systems.

Would that work for you guys?

Thanx, Paul

2023-12-13 15:55:03

by Josef Bacik

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Wed, Dec 13, 2023 at 06:56:37AM -0800, Paul E. McKenney wrote:
> On Wed, Dec 13, 2023 at 01:53:58PM +0100, David Sterba wrote:
> > On Sat, Dec 09, 2023 at 07:51:30AM -0800, Paul E. McKenney wrote:
> > > On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> > > > head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> > > > commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> > > > config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> > > > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
> > > >
> > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > the same patch/commit), kindly add following tags
> > > > | Reported-by: kernel test robot <[email protected]>
> > > > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > > >
> > > > All errors (new ones prefixed by >>):
> > > >
> > > > warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> > > > warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> > > > warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> > > > >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> > > > 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> > > > | ^
> > >
> > > Ooooh!!! :-/
> > >
> > > From what I can see, the current code can tear this load on 32-bit
> > > systems, which can result in bad comparisons and then in failure to wait
> > > for a partially complete transaction.
> > >
> > > So is btrfs actually supported on 32-bit systems? If not, would the
> > > following patch be appropriate?
> >
> > There are limitations on 32bit systems, eg. due to shorter inode numbers
> > (ino_t is unsigned long) and that radix-tree/xarray does support only
> > unsigned long keys, while we have 64bit identifiers for inodes or tree
> > roots.
> >
> > So far we support that and dropping it completely is I think a big deal,
> > like with any possibly used feature. What I've seen there are NAS boxes
> > with low power ARM that are 32bit.
> >
> > > If btrfs is to be supported on 32-bit systems, from what I can see some
> > > major surgery is required, even if a 32-bit counter is wrap-safe for
> > > this particular type of transaction. (But SSDs? In-memory btrfs
> > > filesystems?)
> >
> > We won't probably do any major surgery to support 32bit systems.
>
> Got it, and thank you for the background! My takeaway is that 32-bit
> BTRFS must work in the common case, but might have issues on some
> workloads, for example, running out of inode numbers or load tearing.
>
> > > ------------------------------------------------------------------------
> > >
> > > diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> > > index 4fb925e8c981..4d56158c34f9 100644
> > > --- a/fs/btrfs/Kconfig
> > > +++ b/fs/btrfs/Kconfig
> > > @@ -19,6 +19,7 @@ config BTRFS_FS
> > > select RAID6_PQ
> > > select XOR_BLOCKS
> > > depends on PAGE_SIZE_LESS_THAN_256KB
> > > + depends on 64BIT
> >
> > Can we keep the current inefficient smp_* barriers instead of dropping
> > the whole 32bit support as an alternative. If the smp_load_acquire are
> > better but not strictly necessary for the correctness (from the barriers
> > POV) I'd suggest to leave it as-is. We can put comments in case somebody
> > wants to optimize it in the future again.
>
> We still have the barrier placement issue, given that smp_rmb() enforces
> only the ordering of earlier and later loads, correct? Or am I missing
> some other ordering constraint that makes all that work?
>
> But I can make each of the current patch's smp_load_acquire() call instead
> be be a READ_ONCE() followed by an smp_rmb(), the test_bit_acquire()
> call be test_bit() followed by smp_rmb(), and the smp_store_release()
> call be an smp_wmb() followed by a WRITE_ONCE(). This provides the needed
> ordering, bullet-proofs the 64-bit code against compilers, but works on
> 32-bit systems. For example, on a 32-bit system the 64-bit READ_ONCE()
> and WRITE_ONCE() might still be compiled as a pair of 32-bit memory
> accesses, but they will both be guaranteed to be single memory accesses
> on 64-bit systems.
>
> Would that work for you guys?

Actually I think we just need to re-work all of this to make it less silly.
Does this look reasonable to you Paul? I still have to test it, but I think it
addresses your concerns and lets us keep 32bit support ;). Thanks,

Josef

From: Josef Bacik <[email protected]>
Date: Wed, 13 Dec 2023 10:46:57 -0500
Subject: [PATCH] btrfs: cleanup how we record roots in the transaction

Previously we didn't have a bit field for tracking if a root was
recorded in the transaction, it was simply an integer on the root.
Additionally we have the ->last_trans which we record the last
transaction this root was modified in. We had various memory barriers
to attempt to synchronize these two updates so we could avoid taking the
->reloc_mutex all of the time for transaction starts.

However this code has changed a lot, but the dubious memory ordering has
remained. Fix up this code to make it more clear and get rid of the
memory barriers. All we're trying to avoid here is constantly taking
the ->reloc_mutex, so add a new bit to mark that the root has been
recorded. If this bit is set then we can skip the expensive step,
otherwise we have to take the lock and wait.

Add a big comment indicating why we're doing the ordering and locking
the way we are, and use ->fs_roots_radix_lock as the arbiter of this new
bit.

This makes the code simpler, removes the memory barriers, and documents
clearly what we're doing so it's less confusing for future readers.

Signed-off-by: Josef Bacik <[email protected]>
---
fs/btrfs/ctree.h | 11 ++++---
fs/btrfs/disk-io.c | 1 +
fs/btrfs/transaction.c | 69 +++++++++++++++++-------------------------
3 files changed, 34 insertions(+), 47 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 70e828d33177..eb44cf191863 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -90,12 +90,13 @@ struct btrfs_path {
*/
enum {
/*
- * btrfs_record_root_in_trans is a multi-step process, and it can race
- * with the balancing code. But the race is very small, and only the
- * first time the root is added to each transaction. So IN_TRANS_SETUP
- * is used to tell us when more checks are required
+ * We set this after we've recorded the root in the transaction. This
+ * is used to avoid taking the reloc_mutex every time we call
+ * btrfs_start_transaction(root). Once the root has been recorded we
+ * don't have to do the setup work again, and this is cleared on
+ * transaction commit once we're done with the root.
*/
- BTRFS_ROOT_IN_TRANS_SETUP,
+ BTRFS_ROOT_RECORDED,

/*
* Set if tree blocks of this root can be shared by other roots.
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c6907d533fe8..0410fac5f78e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4805,6 +4805,7 @@ static void btrfs_free_all_qgroup_pertrans(struct btrfs_fs_info *fs_info)
radix_tree_tag_clear(&fs_info->fs_roots_radix,
(unsigned long)root->root_key.objectid,
BTRFS_ROOT_TRANS_TAG);
+ clear_bit(BTRFS_ROOT_RECORDED, &root->state);
}
}
spin_unlock(&fs_info->fs_roots_radix_lock);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 5b3333ceef04..186089ddac27 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -407,54 +407,43 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
int ret = 0;

if ((test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
- root->last_trans < trans->transid) || force) {
+ !test_bit(BTRFS_ROOT_RECORDED, &root->state)) || force) {
WARN_ON(!force && root->commit_root != root->node);

/*
- * see below for IN_TRANS_SETUP usage rules
- * we have the reloc mutex held now, so there
- * is only one writer in this function
+ * We are using ->fs_roots_radix_lock as the ultimate protector
+ * of BTRFS_ROOT_RECORDED, which means we want to set it once
+ * we're completely done, so we call btrfs_init_reloc_root()
+ * first.
+ *
+ * This is ok because we are protected here by ->reloc_mutex, so
+ * in reality we won't ever race and call
+ * btrfs_init_reloc_root() twice on the same root back to back.
+ * However if we do it's ok, because again we're protected by
+ * ->reloc_mutex and the second call will simply update the
+ * ->last_trans on the reloc root to the current transid, making
+ * it essentially a no-op.
+ *
+ * We're using ->fs_roots_radix_lock here because
+ * btrfs_drop_snapshot will clear the tag so we don't update the
+ * root, and we will clear the RECORDED flag there without the
+ * ->reloc_mutex lock held. Again this is fine because we won't
+ * be messing with a dropped root anymore, but for concurrency
+ * it's best to keep the rules nice and simple.
*/
- set_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state);
-
- /* make sure readers find IN_TRANS_SETUP before
- * they find our root->last_trans update
- */
- smp_wmb();
+ ret = btrfs_init_reloc_root(trans, root);

spin_lock(&fs_info->fs_roots_radix_lock);
- if (root->last_trans == trans->transid && !force) {
+ if (test_bit(BTRFS_ROOT_RECORDED, &root->state) && !force) {
spin_unlock(&fs_info->fs_roots_radix_lock);
return 0;
}
radix_tree_tag_set(&fs_info->fs_roots_radix,
(unsigned long)root->root_key.objectid,
BTRFS_ROOT_TRANS_TAG);
- spin_unlock(&fs_info->fs_roots_radix_lock);
+ set_bit(BTRFS_ROOT_RECORDED, &root->state);
root->last_trans = trans->transid;
-
- /* this is pretty tricky. We don't want to
- * take the relocation lock in btrfs_record_root_in_trans
- * unless we're really doing the first setup for this root in
- * this transaction.
- *
- * Normally we'd use root->last_trans as a flag to decide
- * if we want to take the expensive mutex.
- *
- * But, we have to set root->last_trans before we
- * init the relocation root, otherwise, we trip over warnings
- * in ctree.c. The solution used here is to flag ourselves
- * with root IN_TRANS_SETUP. When this is 1, we're still
- * fixing up the reloc trees and everyone must wait.
- *
- * When this is zero, they can trust root->last_trans and fly
- * through btrfs_record_root_in_trans without having to take the
- * lock. smp_wmb() makes sure that all the writes above are
- * done before we pop in the zero below
- */
- ret = btrfs_init_reloc_root(trans, root);
- smp_mb__before_atomic();
- clear_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state);
+ spin_unlock(&fs_info->fs_roots_radix_lock);
}
return ret;
}
@@ -476,6 +465,7 @@ void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
radix_tree_tag_clear(&fs_info->fs_roots_radix,
(unsigned long)root->root_key.objectid,
BTRFS_ROOT_TRANS_TAG);
+ clear_bit(BTRFS_ROOT_RECORDED, &root->state);
spin_unlock(&fs_info->fs_roots_radix_lock);
}

@@ -488,13 +478,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
return 0;

- /*
- * see record_root_in_trans for comments about IN_TRANS_SETUP usage
- * and barriers
- */
- smp_rmb();
- if (root->last_trans == trans->transid &&
- !test_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state))
+ if (test_bit(BTRFS_ROOT_RECORDED, &root->state))
return 0;

mutex_lock(&fs_info->reloc_mutex);
@@ -1531,6 +1515,7 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
radix_tree_tag_clear(&fs_info->fs_roots_radix,
(unsigned long)root->root_key.objectid,
BTRFS_ROOT_TRANS_TAG);
+ clear_bit(BTRFS_ROOT_RECORDED, &root->state);
spin_unlock(&fs_info->fs_roots_radix_lock);

btrfs_free_log(trans, root);
--
2.43.0

2023-12-13 18:06:39

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Wed, Dec 13, 2023 at 10:54:40AM -0500, Josef Bacik wrote:
> On Wed, Dec 13, 2023 at 06:56:37AM -0800, Paul E. McKenney wrote:
> > On Wed, Dec 13, 2023 at 01:53:58PM +0100, David Sterba wrote:
> > > On Sat, Dec 09, 2023 at 07:51:30AM -0800, Paul E. McKenney wrote:
> > > > On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> > > > > head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> > > > > commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> > > > > config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> > > > > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
> > > > >
> > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > > the same patch/commit), kindly add following tags
> > > > > | Reported-by: kernel test robot <[email protected]>
> > > > > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > > > >
> > > > > All errors (new ones prefixed by >>):
> > > > >
> > > > > warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> > > > > warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> > > > > warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> > > > > >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> > > > > 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> > > > > | ^
> > > >
> > > > Ooooh!!! :-/
> > > >
> > > > From what I can see, the current code can tear this load on 32-bit
> > > > systems, which can result in bad comparisons and then in failure to wait
> > > > for a partially complete transaction.
> > > >
> > > > So is btrfs actually supported on 32-bit systems? If not, would the
> > > > following patch be appropriate?
> > >
> > > There are limitations on 32bit systems, eg. due to shorter inode numbers
> > > (ino_t is unsigned long) and that radix-tree/xarray does support only
> > > unsigned long keys, while we have 64bit identifiers for inodes or tree
> > > roots.
> > >
> > > So far we support that and dropping it completely is I think a big deal,
> > > like with any possibly used feature. What I've seen there are NAS boxes
> > > with low power ARM that are 32bit.
> > >
> > > > If btrfs is to be supported on 32-bit systems, from what I can see some
> > > > major surgery is required, even if a 32-bit counter is wrap-safe for
> > > > this particular type of transaction. (But SSDs? In-memory btrfs
> > > > filesystems?)
> > >
> > > We won't probably do any major surgery to support 32bit systems.
> >
> > Got it, and thank you for the background! My takeaway is that 32-bit
> > BTRFS must work in the common case, but might have issues on some
> > workloads, for example, running out of inode numbers or load tearing.
> >
> > > > ------------------------------------------------------------------------
> > > >
> > > > diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> > > > index 4fb925e8c981..4d56158c34f9 100644
> > > > --- a/fs/btrfs/Kconfig
> > > > +++ b/fs/btrfs/Kconfig
> > > > @@ -19,6 +19,7 @@ config BTRFS_FS
> > > > select RAID6_PQ
> > > > select XOR_BLOCKS
> > > > depends on PAGE_SIZE_LESS_THAN_256KB
> > > > + depends on 64BIT
> > >
> > > Can we keep the current inefficient smp_* barriers instead of dropping
> > > the whole 32bit support as an alternative. If the smp_load_acquire are
> > > better but not strictly necessary for the correctness (from the barriers
> > > POV) I'd suggest to leave it as-is. We can put comments in case somebody
> > > wants to optimize it in the future again.
> >
> > We still have the barrier placement issue, given that smp_rmb() enforces
> > only the ordering of earlier and later loads, correct? Or am I missing
> > some other ordering constraint that makes all that work?
> >
> > But I can make each of the current patch's smp_load_acquire() call instead
> > be be a READ_ONCE() followed by an smp_rmb(), the test_bit_acquire()
> > call be test_bit() followed by smp_rmb(), and the smp_store_release()
> > call be an smp_wmb() followed by a WRITE_ONCE(). This provides the needed
> > ordering, bullet-proofs the 64-bit code against compilers, but works on
> > 32-bit systems. For example, on a 32-bit system the 64-bit READ_ONCE()
> > and WRITE_ONCE() might still be compiled as a pair of 32-bit memory
> > accesses, but they will both be guaranteed to be single memory accesses
> > on 64-bit systems.
> >
> > Would that work for you guys?
>
> Actually I think we just need to re-work all of this to make it less silly.
> Does this look reasonable to you Paul? I still have to test it, but I think it
> addresses your concerns and lets us keep 32bit support ;). Thanks,

Things that avoid the need for memory barriers are often improvements!

I don't claim to understand enough of the BTRFS code to fully judge
this, but I do ask one stupid question below just in case it inspires
a non-stupid idea. ;-)

Thanx, Paul

> Josef
>
> From: Josef Bacik <[email protected]>
> Date: Wed, 13 Dec 2023 10:46:57 -0500
> Subject: [PATCH] btrfs: cleanup how we record roots in the transaction
>
> Previously we didn't have a bit field for tracking if a root was
> recorded in the transaction, it was simply an integer on the root.
> Additionally we have the ->last_trans which we record the last
> transaction this root was modified in. We had various memory barriers
> to attempt to synchronize these two updates so we could avoid taking the
> ->reloc_mutex all of the time for transaction starts.
>
> However this code has changed a lot, but the dubious memory ordering has
> remained. Fix up this code to make it more clear and get rid of the
> memory barriers. All we're trying to avoid here is constantly taking
> the ->reloc_mutex, so add a new bit to mark that the root has been
> recorded. If this bit is set then we can skip the expensive step,
> otherwise we have to take the lock and wait.
>
> Add a big comment indicating why we're doing the ordering and locking
> the way we are, and use ->fs_roots_radix_lock as the arbiter of this new
> bit.
>
> This makes the code simpler, removes the memory barriers, and documents
> clearly what we're doing so it's less confusing for future readers.
>
> Signed-off-by: Josef Bacik <[email protected]>
> ---
> fs/btrfs/ctree.h | 11 ++++---
> fs/btrfs/disk-io.c | 1 +
> fs/btrfs/transaction.c | 69 +++++++++++++++++-------------------------
> 3 files changed, 34 insertions(+), 47 deletions(-)
>
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 70e828d33177..eb44cf191863 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -90,12 +90,13 @@ struct btrfs_path {
> */
> enum {
> /*
> - * btrfs_record_root_in_trans is a multi-step process, and it can race
> - * with the balancing code. But the race is very small, and only the
> - * first time the root is added to each transaction. So IN_TRANS_SETUP
> - * is used to tell us when more checks are required
> + * We set this after we've recorded the root in the transaction. This
> + * is used to avoid taking the reloc_mutex every time we call
> + * btrfs_start_transaction(root). Once the root has been recorded we
> + * don't have to do the setup work again, and this is cleared on
> + * transaction commit once we're done with the root.
> */
> - BTRFS_ROOT_IN_TRANS_SETUP,
> + BTRFS_ROOT_RECORDED,
>
> /*
> * Set if tree blocks of this root can be shared by other roots.
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index c6907d533fe8..0410fac5f78e 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -4805,6 +4805,7 @@ static void btrfs_free_all_qgroup_pertrans(struct btrfs_fs_info *fs_info)
> radix_tree_tag_clear(&fs_info->fs_roots_radix,
> (unsigned long)root->root_key.objectid,
> BTRFS_ROOT_TRANS_TAG);
> + clear_bit(BTRFS_ROOT_RECORDED, &root->state);
> }
> }
> spin_unlock(&fs_info->fs_roots_radix_lock);
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 5b3333ceef04..186089ddac27 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -407,54 +407,43 @@ static int record_root_in_trans(struct btrfs_trans_handle *trans,
> int ret = 0;
>
> if ((test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
> - root->last_trans < trans->transid) || force) {
> + !test_bit(BTRFS_ROOT_RECORDED, &root->state)) || force) {

Would it make sense to test both bits at once? Something like:

if ((READ_ONCE(root->state) & (BIT(BTRFS_ROOT_SHAREABLE) | BIT(BTRFS_ROOT_RECORDED))) == BIT(BTRFS_ROOT_SHAREABLE))

That way you get an atomic test of both bits without the memory-ordering
aggravation. Though you already hold the mutex here, so this should
not matter in this case, hence the question being stupid. ;-)

> WARN_ON(!force && root->commit_root != root->node);
>
> /*
> - * see below for IN_TRANS_SETUP usage rules
> - * we have the reloc mutex held now, so there
> - * is only one writer in this function
> + * We are using ->fs_roots_radix_lock as the ultimate protector
> + * of BTRFS_ROOT_RECORDED, which means we want to set it once
> + * we're completely done, so we call btrfs_init_reloc_root()
> + * first.
> + *
> + * This is ok because we are protected here by ->reloc_mutex, so
> + * in reality we won't ever race and call
> + * btrfs_init_reloc_root() twice on the same root back to back.
> + * However if we do it's ok, because again we're protected by
> + * ->reloc_mutex and the second call will simply update the
> + * ->last_trans on the reloc root to the current transid, making
> + * it essentially a no-op.
> + *
> + * We're using ->fs_roots_radix_lock here because
> + * btrfs_drop_snapshot will clear the tag so we don't update the
> + * root, and we will clear the RECORDED flag there without the
> + * ->reloc_mutex lock held. Again this is fine because we won't
> + * be messing with a dropped root anymore, but for concurrency
> + * it's best to keep the rules nice and simple.
> */
> - set_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state);
> -
> - /* make sure readers find IN_TRANS_SETUP before
> - * they find our root->last_trans update
> - */
> - smp_wmb();
> + ret = btrfs_init_reloc_root(trans, root);
>
> spin_lock(&fs_info->fs_roots_radix_lock);
> - if (root->last_trans == trans->transid && !force) {
> + if (test_bit(BTRFS_ROOT_RECORDED, &root->state) && !force) {
> spin_unlock(&fs_info->fs_roots_radix_lock);
> return 0;
> }
> radix_tree_tag_set(&fs_info->fs_roots_radix,
> (unsigned long)root->root_key.objectid,
> BTRFS_ROOT_TRANS_TAG);
> - spin_unlock(&fs_info->fs_roots_radix_lock);
> + set_bit(BTRFS_ROOT_RECORDED, &root->state);
> root->last_trans = trans->transid;
> -
> - /* this is pretty tricky. We don't want to
> - * take the relocation lock in btrfs_record_root_in_trans
> - * unless we're really doing the first setup for this root in
> - * this transaction.
> - *
> - * Normally we'd use root->last_trans as a flag to decide
> - * if we want to take the expensive mutex.
> - *
> - * But, we have to set root->last_trans before we
> - * init the relocation root, otherwise, we trip over warnings
> - * in ctree.c. The solution used here is to flag ourselves
> - * with root IN_TRANS_SETUP. When this is 1, we're still
> - * fixing up the reloc trees and everyone must wait.
> - *
> - * When this is zero, they can trust root->last_trans and fly
> - * through btrfs_record_root_in_trans without having to take the
> - * lock. smp_wmb() makes sure that all the writes above are
> - * done before we pop in the zero below
> - */
> - ret = btrfs_init_reloc_root(trans, root);
> - smp_mb__before_atomic();
> - clear_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state);
> + spin_unlock(&fs_info->fs_roots_radix_lock);
> }
> return ret;
> }
> @@ -476,6 +465,7 @@ void btrfs_add_dropped_root(struct btrfs_trans_handle *trans,
> radix_tree_tag_clear(&fs_info->fs_roots_radix,
> (unsigned long)root->root_key.objectid,
> BTRFS_ROOT_TRANS_TAG);
> + clear_bit(BTRFS_ROOT_RECORDED, &root->state);
> spin_unlock(&fs_info->fs_roots_radix_lock);
> }
>
> @@ -488,13 +478,7 @@ int btrfs_record_root_in_trans(struct btrfs_trans_handle *trans,
> if (!test_bit(BTRFS_ROOT_SHAREABLE, &root->state))
> return 0;
>
> - /*
> - * see record_root_in_trans for comments about IN_TRANS_SETUP usage
> - * and barriers
> - */
> - smp_rmb();
> - if (root->last_trans == trans->transid &&
> - !test_bit(BTRFS_ROOT_IN_TRANS_SETUP, &root->state))
> + if (test_bit(BTRFS_ROOT_RECORDED, &root->state))
> return 0;
>
> mutex_lock(&fs_info->reloc_mutex);
> @@ -1531,6 +1515,7 @@ static noinline int commit_fs_roots(struct btrfs_trans_handle *trans)
> radix_tree_tag_clear(&fs_info->fs_roots_radix,
> (unsigned long)root->root_key.objectid,
> BTRFS_ROOT_TRANS_TAG);
> + clear_bit(BTRFS_ROOT_RECORDED, &root->state);
> spin_unlock(&fs_info->fs_roots_radix_lock);
>
> btrfs_free_log(trans, root);
> --
> 2.43.0
>

2023-12-13 19:56:37

by Josef Bacik

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Wed, Dec 13, 2023 at 1:05 PM Paul E. McKenney <[email protected]> wrote:
>
> On Wed, Dec 13, 2023 at 10:54:40AM -0500, Josef Bacik wrote:
> > On Wed, Dec 13, 2023 at 06:56:37AM -0800, Paul E. McKenney wrote:
> > > On Wed, Dec 13, 2023 at 01:53:58PM +0100, David Sterba wrote:
> > > > On Sat, Dec 09, 2023 at 07:51:30AM -0800, Paul E. McKenney wrote:
> > > > > On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> > > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> > > > > > head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> > > > > > commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> > > > > > config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> > > > > > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > > > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
> > > > > >
> > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > > > the same patch/commit), kindly add following tags
> > > > > > | Reported-by: kernel test robot <[email protected]>
> > > > > > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > > > > >
> > > > > > All errors (new ones prefixed by >>):
> > > > > >
> > > > > > warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> > > > > > warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> > > > > > warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> > > > > > >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> > > > > > 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> > > > > > | ^
> > > > >
> > > > > Ooooh!!! :-/
> > > > >
> > > > > From what I can see, the current code can tear this load on 32-bit
> > > > > systems, which can result in bad comparisons and then in failure to wait
> > > > > for a partially complete transaction.
> > > > >
> > > > > So is btrfs actually supported on 32-bit systems? If not, would the
> > > > > following patch be appropriate?
> > > >
> > > > There are limitations on 32bit systems, eg. due to shorter inode numbers
> > > > (ino_t is unsigned long) and that radix-tree/xarray does support only
> > > > unsigned long keys, while we have 64bit identifiers for inodes or tree
> > > > roots.
> > > >
> > > > So far we support that and dropping it completely is I think a big deal,
> > > > like with any possibly used feature. What I've seen there are NAS boxes
> > > > with low power ARM that are 32bit.
> > > >
> > > > > If btrfs is to be supported on 32-bit systems, from what I can see some
> > > > > major surgery is required, even if a 32-bit counter is wrap-safe for
> > > > > this particular type of transaction. (But SSDs? In-memory btrfs
> > > > > filesystems?)
> > > >
> > > > We won't probably do any major surgery to support 32bit systems.
> > >
> > > Got it, and thank you for the background! My takeaway is that 32-bit
> > > BTRFS must work in the common case, but might have issues on some
> > > workloads, for example, running out of inode numbers or load tearing.
> > >
> > > > > ------------------------------------------------------------------------
> > > > >
> > > > > diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> > > > > index 4fb925e8c981..4d56158c34f9 100644
> > > > > --- a/fs/btrfs/Kconfig
> > > > > +++ b/fs/btrfs/Kconfig
> > > > > @@ -19,6 +19,7 @@ config BTRFS_FS
> > > > > select RAID6_PQ
> > > > > select XOR_BLOCKS
> > > > > depends on PAGE_SIZE_LESS_THAN_256KB
> > > > > + depends on 64BIT
> > > >
> > > > Can we keep the current inefficient smp_* barriers instead of dropping
> > > > the whole 32bit support as an alternative. If the smp_load_acquire are
> > > > better but not strictly necessary for the correctness (from the barriers
> > > > POV) I'd suggest to leave it as-is. We can put comments in case somebody
> > > > wants to optimize it in the future again.
> > >
> > > We still have the barrier placement issue, given that smp_rmb() enforces
> > > only the ordering of earlier and later loads, correct? Or am I missing
> > > some other ordering constraint that makes all that work?
> > >
> > > But I can make each of the current patch's smp_load_acquire() call instead
> > > be be a READ_ONCE() followed by an smp_rmb(), the test_bit_acquire()
> > > call be test_bit() followed by smp_rmb(), and the smp_store_release()
> > > call be an smp_wmb() followed by a WRITE_ONCE(). This provides the needed
> > > ordering, bullet-proofs the 64-bit code against compilers, but works on
> > > 32-bit systems. For example, on a 32-bit system the 64-bit READ_ONCE()
> > > and WRITE_ONCE() might still be compiled as a pair of 32-bit memory
> > > accesses, but they will both be guaranteed to be single memory accesses
> > > on 64-bit systems.
> > >
> > > Would that work for you guys?
> >
> > Actually I think we just need to re-work all of this to make it less silly.
> > Does this look reasonable to you Paul? I still have to test it, but I think it
> > addresses your concerns and lets us keep 32bit support ;). Thanks,
>
> Things that avoid the need for memory barriers are often improvements!
>
> I don't claim to understand enough of the BTRFS code to fully judge
> this, but I do ask one stupid question below just in case it inspires
> a non-stupid idea. ;-)
>

Not a stupid question, I see what you're getting at. We mix short
term flags and long term flags on ->state. SHAREABLE in this case is
set at allocation time and never changes, so I'm not worried about
that changing ever. The only thing here that actually changes is the
RECORDED flag, so just the normal test_bit for that is fine. Thanks,

Josef

2023-12-13 20:16:06

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [paulmck-rcu:frederic.2023.12.08a 29/37] fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.

On Wed, Dec 13, 2023 at 02:53:52PM -0500, Josef Bacik wrote:
> On Wed, Dec 13, 2023 at 1:05 PM Paul E. McKenney <[email protected]> wrote:
> >
> > On Wed, Dec 13, 2023 at 10:54:40AM -0500, Josef Bacik wrote:
> > > On Wed, Dec 13, 2023 at 06:56:37AM -0800, Paul E. McKenney wrote:
> > > > On Wed, Dec 13, 2023 at 01:53:58PM +0100, David Sterba wrote:
> > > > > On Sat, Dec 09, 2023 at 07:51:30AM -0800, Paul E. McKenney wrote:
> > > > > > On Sat, Dec 09, 2023 at 06:20:37PM +0800, kernel test robot wrote:
> > > > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git frederic.2023.12.08a
> > > > > > > head: 37843b5f561a08ae899fb791eeeb5abd992eabe2
> > > > > > > commit: 7dd87072d40809e26503f04b79d63290288dbbac [29/37] btrfs: Adjust ->last_trans ordering in btrfs_record_root_in_trans()
> > > > > > > config: riscv-rv32_defconfig (https://download.01.org/0day-ci/archive/20231209/[email protected]/config)
> > > > > > > compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
> > > > > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231209/[email protected]/reproduce)
> > > > > > >
> > > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > > > > the same patch/commit), kindly add following tags
> > > > > > > | Reported-by: kernel test robot <[email protected]>
> > > > > > > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > > > > > >
> > > > > > > All errors (new ones prefixed by >>):
> > > > > > >
> > > > > > > warning: unknown warning option '-Wpacked-not-aligned'; did you mean '-Wpacked-non-pod'? [-Wunknown-warning-option]
> > > > > > > warning: unknown warning option '-Wstringop-truncation'; did you mean '-Wstring-concatenation'? [-Wunknown-warning-option]
> > > > > > > warning: unknown warning option '-Wmaybe-uninitialized'; did you mean '-Wuninitialized'? [-Wunknown-warning-option]
> > > > > > > >> fs/btrfs/transaction.c:496:6: error: call to '__compiletime_assert_329' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
> > > > > > > 496 | if (smp_load_acquire(&root->last_trans) == trans->transid && /* ^^^ */
> > > > > > > | ^
> > > > > >
> > > > > > Ooooh!!! :-/
> > > > > >
> > > > > > From what I can see, the current code can tear this load on 32-bit
> > > > > > systems, which can result in bad comparisons and then in failure to wait
> > > > > > for a partially complete transaction.
> > > > > >
> > > > > > So is btrfs actually supported on 32-bit systems? If not, would the
> > > > > > following patch be appropriate?
> > > > >
> > > > > There are limitations on 32bit systems, eg. due to shorter inode numbers
> > > > > (ino_t is unsigned long) and that radix-tree/xarray does support only
> > > > > unsigned long keys, while we have 64bit identifiers for inodes or tree
> > > > > roots.
> > > > >
> > > > > So far we support that and dropping it completely is I think a big deal,
> > > > > like with any possibly used feature. What I've seen there are NAS boxes
> > > > > with low power ARM that are 32bit.
> > > > >
> > > > > > If btrfs is to be supported on 32-bit systems, from what I can see some
> > > > > > major surgery is required, even if a 32-bit counter is wrap-safe for
> > > > > > this particular type of transaction. (But SSDs? In-memory btrfs
> > > > > > filesystems?)
> > > > >
> > > > > We won't probably do any major surgery to support 32bit systems.
> > > >
> > > > Got it, and thank you for the background! My takeaway is that 32-bit
> > > > BTRFS must work in the common case, but might have issues on some
> > > > workloads, for example, running out of inode numbers or load tearing.
> > > >
> > > > > > ------------------------------------------------------------------------
> > > > > >
> > > > > > diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
> > > > > > index 4fb925e8c981..4d56158c34f9 100644
> > > > > > --- a/fs/btrfs/Kconfig
> > > > > > +++ b/fs/btrfs/Kconfig
> > > > > > @@ -19,6 +19,7 @@ config BTRFS_FS
> > > > > > select RAID6_PQ
> > > > > > select XOR_BLOCKS
> > > > > > depends on PAGE_SIZE_LESS_THAN_256KB
> > > > > > + depends on 64BIT
> > > > >
> > > > > Can we keep the current inefficient smp_* barriers instead of dropping
> > > > > the whole 32bit support as an alternative. If the smp_load_acquire are
> > > > > better but not strictly necessary for the correctness (from the barriers
> > > > > POV) I'd suggest to leave it as-is. We can put comments in case somebody
> > > > > wants to optimize it in the future again.
> > > >
> > > > We still have the barrier placement issue, given that smp_rmb() enforces
> > > > only the ordering of earlier and later loads, correct? Or am I missing
> > > > some other ordering constraint that makes all that work?
> > > >
> > > > But I can make each of the current patch's smp_load_acquire() call instead
> > > > be be a READ_ONCE() followed by an smp_rmb(), the test_bit_acquire()
> > > > call be test_bit() followed by smp_rmb(), and the smp_store_release()
> > > > call be an smp_wmb() followed by a WRITE_ONCE(). This provides the needed
> > > > ordering, bullet-proofs the 64-bit code against compilers, but works on
> > > > 32-bit systems. For example, on a 32-bit system the 64-bit READ_ONCE()
> > > > and WRITE_ONCE() might still be compiled as a pair of 32-bit memory
> > > > accesses, but they will both be guaranteed to be single memory accesses
> > > > on 64-bit systems.
> > > >
> > > > Would that work for you guys?
> > >
> > > Actually I think we just need to re-work all of this to make it less silly.
> > > Does this look reasonable to you Paul? I still have to test it, but I think it
> > > addresses your concerns and lets us keep 32bit support ;). Thanks,
> >
> > Things that avoid the need for memory barriers are often improvements!
> >
> > I don't claim to understand enough of the BTRFS code to fully judge
> > this, but I do ask one stupid question below just in case it inspires
> > a non-stupid idea. ;-)
>
> Not a stupid question, I see what you're getting at. We mix short
> term flags and long term flags on ->state. SHAREABLE in this case is
> set at allocation time and never changes, so I'm not worried about
> that changing ever. The only thing here that actually changes is the
> RECORDED flag, so just the normal test_bit for that is fine. Thanks,

In that case, looks good to me:

Acked-by: Paul E. McKenney <[email protected]>

Thanx, Paul