Hi Jianan and Jianhua,
On Tue, Nov 23, 2021 at 11:58:32AM +0800, Huang Jianan wrote:
> On 2021/11/23 10:59, Jianhua1 Hao 郝建华 via Linux-erofs wrote:
> > We also found that it is easy to cause a deadlock in the kswapd scenario. We
> > observed the following deadlock in the stress test under a low-memory
> > scenario, the same as in "erofs: fix deadlock when shrink erofs slab".
> >
> > Thread A:
> >
> >   erofs_try_to_release_workgroup(grp = 0xFFFFFF87ADFEE610)
> >   erofs_workgroup_try_to_freeze(grp, 1)
> >   // set ref count to EROFS_LOCKED_MAGIC
> >   atomic_cmpxchg(&grp->refcount, val, EROFS_LOCKED_MAGIC)
> >   xa_erase(&sbi->managed_pslots, grp->index)
> >   // stuck there to wait for xa lock, already held by thread B
> >   xa_lock(xa);
> >
> > Thread B:
> >
> >   erofs_insert_workgroup()
> >   // xa lock is held here
> >   xa_lock(&sbi->managed_pslots);
> >   pre = __xa_cmpxchg(&sbi->managed_pslots, grp->index, NULL, grp, GFP_NOFS);
> >   erofs_workgroup_get(pre) // pre = grp = 0xFFFFFF87ADFEE610
> >   erofs_wait_on_workgroup_freezed(grp);
> >   // wait ref count to be unlocked, which should be done by thread A
> >   atomic_cond_read_relaxed(&grp->refcount, VAL != EROFS_LOCKED_MAGIC);
> >
> > Follow-up fix: does it need to hold the xa lock before freezing the
> > workgroup, because we will operate on the xarray?
> >
> Hi, JianHua,
>
> The fix is in this patch; please kindly test it if you are able to.
> https://lore.kernel.org/linux-erofs/YZcJpDs3FKpSfzAE@B-P7TQMD6M-0146/T/#t
Thanks for the report; I was busy with some other work just now.
I've pushed this patch out to the fixes branch and will send it to Linus
this week:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git/commit/?id=deccd444d2844f1e89314dfc3956cccfdb813b65
As Jianan said, I believe this patch can fix your issue, so please give
it a try in advance. Also, it doesn't affect v4.19 and v5.4 LTS; only
v5.10 and v5.15 LTS are impacted.
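
For illustration only, here is a rough userspace sketch of the lock ordering
discussed above: take the lock that protects the slots before freezing the
refcount, so the freezer never has to acquire that lock while another thread
holds it and spins on the frozen count. This is not the kernel patch. The
names slots_lock, in_tree, LOCKED_MAGIC, releaser(), finder() and run_race()
are hypothetical stand-ins (a pthread mutex for the xa lock, a bool for the
xarray slot, a plain atomic int for grp->refcount).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define LOCKED_MAGIC (-1)	/* hypothetical stand-in for EROFS_LOCKED_MAGIC */

/* Stand-in for the xa lock; it protects in_tree. */
static pthread_mutex_t slots_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int refcount = 1;	/* stand-in for grp->refcount */
static bool in_tree = true;	/* stand-in for the xarray slot */

/* Releaser: lock first, then freeze, erase and unfreeze. */
static void *releaser(void *arg)
{
	bool *released = arg;
	int expected = 1;

	pthread_mutex_lock(&slots_lock);
	if (in_tree &&
	    atomic_compare_exchange_strong(&refcount, &expected, LOCKED_MAGIC)) {
		in_tree = false;		/* erase while already holding the lock */
		atomic_store(&refcount, 0);	/* leave the frozen state */
		*released = true;
	}
	pthread_mutex_unlock(&slots_lock);
	return NULL;
}

/* Finder: look up under the lock and take a reference. */
static void *finder(void *arg)
{
	bool *got = arg;

	pthread_mutex_lock(&slots_lock);
	if (in_tree) {
		/*
		 * The releaser only freezes while holding slots_lock, so this
		 * wait can never depend on a thread that needs slots_lock.
		 */
		while (atomic_load(&refcount) == LOCKED_MAGIC)
			;
		atomic_fetch_add(&refcount, 1);
		*got = true;
	}
	pthread_mutex_unlock(&slots_lock);
	return NULL;
}

/* Run both threads once and report which one won. */
static void run_race(bool *released, bool *got)
{
	pthread_t a, b;

	*released = false;
	*got = false;
	assert(pthread_create(&a, NULL, releaser, released) == 0);
	assert(pthread_create(&b, NULL, finder, got) == 0);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
}
```

Whichever thread takes slots_lock first wins, and both threads always
finish: either the object is released and the finder sees nothing, or the
finder takes a reference and the release attempt fails.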
Thanks for your report!
Gao Xiang