Check lru_cache_disabled() under bh_lru_lock. Otherwise, the race
below can occur, and migration of pages containing buffer_heads
fails.

CPU 0                                   CPU 1

bh_lru_install
                                        lru_cache_disable
  lru_cache_disabled = false
                                        atomic_inc(&lru_disable_count);
                                        invalidate_bh_lrus_cpu of CPU 0
                                        bh_lru_lock
                                        __invalidate_bh_lrus
                                        bh_lru_unlock
  bh_lru_lock
  install the bh
  bh_lru_unlock

Fixes: 8cc621d2f45d ("mm: fs: invalidate BH LRU during page migration")
Signed-off-by: Minchan Kim <[email protected]>
---
fs/buffer.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 8e112b6bd371..c76a8ef60a75 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1235,16 +1235,18 @@ static void bh_lru_install(struct buffer_head *bh)
 	int i;
 	check_irqs_on();
+	bh_lru_lock();
+
 	/*
 	 * the refcount of buffer_head in bh_lru prevents dropping the
 	 * attached page(i.e., try_to_free_buffers) so it could cause
 	 * failing page migration.
 	 * Skip putting upcoming bh into bh_lru until migration is done.
 	 */
-	if (lru_cache_disabled())
+	if (lru_cache_disabled()) {
+		bh_lru_unlock();
 		return;
-
-	bh_lru_lock();
+	}
 	b = this_cpu_ptr(&bh_lrus);
 	for (i = 0; i < BH_LRU_SIZE; i++) {
--
2.35.1.616.g0bdcbb4464-goog
On Tue, 8 Mar 2022 10:07:09 -0800 Minchan Kim <[email protected]> wrote:
> Check lru_cache_disabled under bh_lru_lock. Otherwise, it could
> introduce race below and it fails to migrate pages containing
> buffer_head.
>
> CPU 0                                   CPU 1
>
> bh_lru_install
>                                         lru_cache_disable
>   lru_cache_disabled = false
>                                         atomic_inc(&lru_disable_count);
>                                         invalidate_bh_lrus_cpu of CPU 0
>                                         bh_lru_lock
>                                         __invalidate_bh_lrus
>                                         bh_lru_unlock
>   bh_lru_lock
>   install the bh
>   bh_lru_unlock
What are the user-visible runtime effects of this bug?
Is a cc:stable needed?
Should there be a reported-by?
Thanks.
On Wed, Mar 09, 2022 at 02:06:27PM -0800, Andrew Morton wrote:
> On Tue, 8 Mar 2022 10:07:09 -0800 Minchan Kim <[email protected]> wrote:
>
> > Check lru_cache_disabled under bh_lru_lock. Otherwise, it could
> > introduce race below and it fails to migrate pages containing
> > buffer_head.
> >
> > CPU 0                                   CPU 1
> >
> > bh_lru_install
> >                                         lru_cache_disable
> >   lru_cache_disabled = false
> >                                         atomic_inc(&lru_disable_count);
> >                                         invalidate_bh_lrus_cpu of CPU 0
> >                                         bh_lru_lock
> >                                         __invalidate_bh_lrus
> >                                         bh_lru_unlock
> >   bh_lru_lock
> >   install the bh
> >   bh_lru_unlock
>
> What are the user-visible runtime effects of this bug?
Once the race happens, CMA allocation fails, which is critical for
workloads that depend on CMA allocation.
>
> Is a cc:stable needed?
Ah, I missed it. The race should be rare, since CMA allocation itself
is rare, but once it triggers it makes the CMA allocation fail, which
is critical for some workloads. The patch is also small, so I think
it's worth adding to stable.
>
> Should there be a reported-by?
I found it on my own while reviewing Marcelo's other patchset, so
I don't think we need to add a Reported-by for me.
Andrew, please tell me if you want me to resend it.