2020-12-15 12:32:43

by Ruan Shiyang

Subject: [RFC PATCH v3 0/9] fsdax: introduce fs query to support reflink

This patchset is an attempt to resolve the problem of tracking shared pages
for fsdax.

Changes from v2:
- Adjust the order of patches
- Divide the infrastructure and the drivers that use it
- Rebased to v5.10

Changes from v1:
- Introduce ->block_lost() for block device
- Support mapped device
- Add 'not available' warning for realtime device in XFS
- Rebased to v5.10-rc1

This patchset moves owner tracking from dax_associate_entry() to the pmem
device driver by introducing a ->memory_failure() interface in struct
dev_pagemap_ops. memory_failure() in mm calls this interface, and the pmem
device implements it. The pmem device then calls its ->corrupted_range()
to find the filesystem in which the corrupted data is located, and calls the
filesystem handler to track the files or metadata associated with this page.
Finally, we can try to fix the corrupted data in the filesystem and do
other necessary processing, such as killing the processes that are using
the affected files.

The call trace is like this:
memory_failure()
  pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
   gendisk->fops->corrupted_range() => - pmem_corrupted_range()
                                       - md_blk_corrupted_range()
    sb->s_ops->corrupted_range()    => xfs_fs_corrupted_range()
     xfs_rmap_query_range()
      xfs_currupt_helper()
       * corrupted on metadata
           try to recover data, call xfs_force_shutdown()
       * corrupted on file data
           try to recover data, call mf_dax_mapping_kill_procs()
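
To make the layering in this trace easier to follow, here is a minimal
userspace sketch of the same three-level callback chain. It is not kernel
code: every mock_* name is a hypothetical stand-in, and the 4 KiB page size
and the pfn-to-byte-offset translation are assumptions made only for this
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

struct mock_sb;
struct mock_disk;
struct mock_pgmap;

/* Stand-in for the superblock operation (patch 3/9). */
struct mock_sb_ops {
	int (*corrupted_range)(struct mock_sb *sb, unsigned long long off,
			       unsigned long long len);
};

/* Stand-in for the block device operation (patch 2/9). */
struct mock_disk_ops {
	int (*corrupted_range)(struct mock_disk *disk, unsigned long long off,
			       unsigned long long len);
};

/* Stand-in for the pagemap operation (patch 1/9). */
struct mock_pgmap_ops {
	int (*memory_failure)(struct mock_pgmap *pgmap, unsigned long pfn,
			      int flags);
};

struct mock_sb {
	const struct mock_sb_ops *ops;
	unsigned long long last_off;	/* last range reported to the fs */
	unsigned long long last_len;
	int calls;
};

struct mock_disk {
	const struct mock_disk_ops *ops;
	struct mock_sb *sb;		/* filesystem on this disk, or NULL */
};

struct mock_pgmap {
	const struct mock_pgmap_ops *ops;
	struct mock_disk *disk;
	unsigned long first_pfn;	/* first pfn backed by the device */
};

/* Filesystem level: a real handler would query its reverse mapping here. */
static int mock_fs_corrupted_range(struct mock_sb *sb, unsigned long long off,
				   unsigned long long len)
{
	sb->last_off = off;
	sb->last_len = len;
	sb->calls++;
	return 0;
}

/* Block level: forward the range to the filesystem on the disk, if any. */
static int mock_disk_corrupted_range(struct mock_disk *disk,
				     unsigned long long off,
				     unsigned long long len)
{
	if (!disk->sb || !disk->sb->ops || !disk->sb->ops->corrupted_range)
		return -EOPNOTSUPP;
	return disk->sb->ops->corrupted_range(disk->sb, off, len);
}

/* Driver level: turn the failed pfn into a byte range on the device. */
static int mock_pgmap_memory_failure(struct mock_pgmap *pgmap,
				     unsigned long pfn, int flags)
{
	unsigned long long off;

	(void)flags;
	assert(pgmap->disk && pgmap->disk->ops);
	if (pfn < pgmap->first_pfn)
		return -EINVAL;
	off = (pfn - pgmap->first_pfn) * MOCK_PAGE_SIZE;
	return pgmap->disk->ops->corrupted_range(pgmap->disk, off,
						 MOCK_PAGE_SIZE);
}

/* Entry point: dispatch to the pagemap owner, as memory_failure() would. */
static int mock_memory_failure(struct mock_pgmap *pgmap, unsigned long pfn,
			       int flags)
{
	if (!pgmap->ops || !pgmap->ops->memory_failure)
		return -EOPNOTSUPP;
	return pgmap->ops->memory_failure(pgmap, pfn, flags);
}
```

Each layer only knows about the one directly below it, which is why a
mapped device can sit between the pmem driver and the filesystem without
the upper layers changing.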

The fsdax & reflink support for XFS is not contained in this patchset.

(Rebased on v5.10)
--

Shiyang Ruan (9):
  pagemap: Introduce ->memory_failure()
  blk: Introduce ->corrupted_range() for block device
  fs: Introduce ->corrupted_range() for superblock
  mm, fsdax: Refactor memory-failure handler for dax mapping
  mm, pmem: Implement ->memory_failure() in pmem driver
  pmem: Implement ->corrupted_range() for pmem driver
  dm: Introduce ->rmap() to find bdev offset
  md: Implement ->corrupted_range()
  xfs: Implement ->corrupted_range() for XFS

 block/genhd.c                 |  12 +++
 drivers/md/dm-linear.c        |   8 ++
 drivers/md/dm.c               |  66 +++++++++++++++
 drivers/nvdimm/pmem.c         |  51 ++++++++++++
 fs/block_dev.c                |  21 +++++
 fs/dax.c                      |  24 +++---
 fs/xfs/xfs_fsops.c            |  10 +++
 fs/xfs/xfs_mount.h            |   2 +
 fs/xfs/xfs_super.c            |  93 +++++++++++++++++++++
 include/linux/blkdev.h        |   2 +
 include/linux/dax.h           |   5 +-
 include/linux/device-mapper.h |   2 +
 include/linux/fs.h            |   2 +
 include/linux/genhd.h         |   8 ++
 include/linux/memremap.h      |   8 ++
 include/linux/mm.h            |   9 ++
 mm/memory-failure.c           | 150 +++++++++++++++++++---------------
 17 files changed, 391 insertions(+), 82 deletions(-)

--
2.29.2




2020-12-15 12:32:48

by Ruan Shiyang

Subject: [RFC PATCH v3 1/9] pagemap: Introduce ->memory_failure()

When a memory failure occurs, we call this function, which is implemented
by each kind of device. For the fsdax case, the pmem device driver
implements it. The pmem device driver finds the block device on which
the error page is located and tries to get the filesystem on that block
device. Finally, it calls the filesystem handler to deal with the error.
The filesystem will try to recover the corrupted data if possible.

Signed-off-by: Shiyang Ruan <[email protected]>
---
include/linux/memremap.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 79c49e7f5c30..0bcf2b1e20bd 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -87,6 +87,14 @@ struct dev_pagemap_ops {
 	 * the page back to a CPU accessible page.
 	 */
 	vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
+
+	/*
+	 * Handle a memory failure that happens on one page.  Notify the
+	 * processes that are using this page, and try to recover the data
+	 * on this page if necessary.
+	 */
+	int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
+			      int flags);
 };

 #define PGMAP_ALTMAP_VALID	(1 << 0)
--
2.29.2



2020-12-18 02:55:34

by Ruan Shiyang

Subject: Re: [RFC PATCH v3 0/9] fsdax: introduce fs query to support reflink



On 2020/12/17 4:55 AM, Jane Chu wrote:
> Hi, Shiyang,
>
> On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
>> The call trace is like this:
>> memory_failure()
>>   pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
>>    gendisk->fops->corrupted_range() => - pmem_corrupted_range()
>>                                        - md_blk_corrupted_range()
>>     sb->s_ops->currupted_range()    => xfs_fs_corrupted_range()
>>      xfs_rmap_query_range()
>>       xfs_currupt_helper()
>>        * corrupted on metadata
>>            try to recover data, call xfs_force_shutdown()
>>        * corrupted on file data
>>            try to recover data, call mf_dax_mapping_kill_procs()
>>
>> The fsdax & reflink support for XFS is not contained in this patchset.
>>
>> (Rebased on v5.10)
>
> So I tried the patchset with pmem error injection, the SIGBUS payload
> does not look right -
>
> ** SIGBUS(7): **
> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
>
> I expect the payload looks like
>
> ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **

Thanks for testing. I tested the SIGBUS by writing a program that calls
madvise(..., MADV_HWPOISON) to inject a memory failure. It only shows
that the program is killed by SIGBUS; I cannot get any details from it.
So, could you please show me the right way (test tools) to test it?


--
Thanks,
Ruan Shiyang.



2020-12-18 03:59:12

by Darrick J. Wong

Subject: Re: [RFC PATCH v3 0/9] fsdax: introduce fs query to support reflink

On Fri, Dec 18, 2020 at 10:44:26AM +0800, Ruan Shiyang wrote:
>
>
> On 2020/12/17 4:55 AM, Jane Chu wrote:
> > Hi, Shiyang,
> >
> > On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
> > > The call trace is like this:
> > > memory_failure()
> > >   pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
> > >    gendisk->fops->corrupted_range() => - pmem_corrupted_range()
> > >                                        - md_blk_corrupted_range()
> > >     sb->s_ops->currupted_range()    => xfs_fs_corrupted_range()
> > >      xfs_rmap_query_range()
> > >       xfs_currupt_helper()
> > >        * corrupted on metadata
> > >            try to recover data, call xfs_force_shutdown()
> > >        * corrupted on file data
> > >            try to recover data, call mf_dax_mapping_kill_procs()
> > >
> > > The fsdax & reflink support for XFS is not contained in this patchset.
> > >
> > > (Rebased on v5.10)
> >
> > So I tried the patchset with pmem error injection, the SIGBUS payload
> > does not look right -
> >
> > ** SIGBUS(7): **
> > ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
> >
> > I expect the payload looks like
> >
> > ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
>
> Thanks for testing. I test the SIGBUS by writing a program which calls
> madvise(... ,MADV_HWPOISON) to inject memory-failure. It just shows that
> the program is killed by SIGBUS. I cannot get any detail from it. So,
> could you please show me the right way(test tools) to test it?

I'm assuming that Jane is using a program that calls sigaction to
install a SIGBUS handler, and dumps the entire siginfo_t structure
whenever it receives one...

--D
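
A minimal sketch of the kind of test program described above might look as
follows. It assumes Linux with glibc; the helper names (sigbus_handler,
install_sigbus_handler, format_sigbus, run_and_catch_sigbus,
send_self_sigbus) are hypothetical, and this is not the program Jane
actually used. The output format only imitates the payload lines quoted in
this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for BUS_MCEERR_* and si_addr_lsb on glibc */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static sigjmp_buf sigbus_env;
static siginfo_t sigbus_info;	/* copy of the most recent SIGBUS payload */

/*
 * Returning normally from a handler for a synchronous fault would retry
 * the faulting access, so save the payload and jump out instead.
 */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	sigbus_info = *info;
	siglongjmp(sigbus_env, 1);
}

/* Install the handler with SA_SIGINFO so that siginfo_t is delivered. */
static int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}

static const char *bus_code_name(int code)
{
	switch (code) {
	case BUS_ADRALN:	return "BUS_ADRALN";
	case BUS_ADRERR:	return "BUS_ADRERR";
	case BUS_OBJERR:	return "BUS_OBJERR";
	case BUS_MCEERR_AR:	return "BUS_MCEERR_AR";
	case BUS_MCEERR_AO:	return "BUS_MCEERR_AO";
	default:		return "other";
	}
}

/* Format the fields of interest; si_addr_lsb is only set for MCEERR codes. */
static int format_sigbus(char *buf, size_t len, const siginfo_t *si)
{
	int mce = si->si_code == BUS_MCEERR_AR || si->si_code == BUS_MCEERR_AO;

	return snprintf(buf, len, "si_addr(%p), si_lsb(0x%X), si_code(0x%x, %s)",
			si->si_addr, mce ? (unsigned)si->si_addr_lsb : 0u,
			(unsigned)si->si_code, bus_code_name(si->si_code));
}

/* Run fn(arg); return 1 and fill *out if it was interrupted by SIGBUS. */
static int run_and_catch_sigbus(void (*fn)(void *), void *arg, siginfo_t *out)
{
	assert(fn && out);
	if (sigsetjmp(sigbus_env, 1) == 0) {
		fn(arg);
		return 0;
	}
	*out = sigbus_info;
	return 1;
}

/*
 * Stand-in trigger for trying the handler out; a real test would instead
 * read from the poisoned DAX mapping.
 */
static void send_self_sigbus(void *unused)
{
	(void)unused;
	raise(SIGBUS);
}
```

With a handler like this installed, a program that touches a poisoned page
can print si_addr and si_lsb instead of simply being killed, which makes it
possible to compare the payload with the expected values quoted above.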


2020-12-18 09:48:30

by Ruan Shiyang

Subject: Re: [RFC PATCH v3 0/9] fsdax: introduce fs query to support reflink



On 2020/12/18 11:49 AM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 10:44:26AM +0800, Ruan Shiyang wrote:
>>
>>
>> On 2020/12/17 4:55 AM, Jane Chu wrote:
>>> Hi, Shiyang,
>>>
>>> On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
>>>> The call trace is like this:
>>>> memory_failure()
>>>>   pgmap->ops->memory_failure()      => pmem_pgmap_memory_failure()
>>>>    gendisk->fops->corrupted_range() => - pmem_corrupted_range()
>>>>                                        - md_blk_corrupted_range()
>>>>     sb->s_ops->currupted_range()    => xfs_fs_corrupted_range()
>>>>      xfs_rmap_query_range()
>>>>       xfs_currupt_helper()
>>>>        * corrupted on metadata
>>>>            try to recover data, call xfs_force_shutdown()
>>>>        * corrupted on file data
>>>>            try to recover data, call mf_dax_mapping_kill_procs()
>>>>
>>>> The fsdax & reflink support for XFS is not contained in this patchset.
>>>>
>>>> (Rebased on v5.10)
>>>
>>> So I tried the patchset with pmem error injection, the SIGBUS payload
>>> does not look right -
>>>
>>> ** SIGBUS(7): **
>>> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
>>>
>>> I expect the payload looks like
>>>
>>> ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
>>
>> Thanks for testing. I test the SIGBUS by writing a program which calls
>> madvise(... ,MADV_HWPOISON) to inject memory-failure. It just shows that
>> the program is killed by SIGBUS. I cannot get any detail from it. So,
>> could you please show me the right way(test tools) to test it?
>
> I'm assuming that Jane is using a program that calls sigaction to
> install a SIGBUS handler, and dumps the entire siginfo_t structure
> whenever it receives one...

OK. Let me try it and figure out what's wrong with it.


--
Thanks,
Ruan Shiyang.



2021-01-08 18:17:35

by Jane Chu

Subject: Re: [RFC PATCH v3 0/9] fsdax: introduce fs query to support reflink

Hi, Shiyang,

On 12/18/2020 1:13 AM, Ruan Shiyang wrote:
>>>>
>>>> So I tried the patchset with pmem error injection, the SIGBUS payload
>>>> does not look right -
>>>>
>>>> ** SIGBUS(7): **
>>>> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
>>>>
>>>> I expect the payload looks like
>>>>
>>>> ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
>>>
>>> Thanks for testing.  I test the SIGBUS by writing a program which calls
>>> madvise(... ,MADV_HWPOISON) to inject memory-failure.  It just shows that
>>> the program is killed by SIGBUS.  I cannot get any detail from it.  So,
>>> could you please show me the right way(test tools) to test it?
>>
>> I'm assuming that Jane is using a program that calls sigaction to
>> install a SIGBUS handler, and dumps the entire siginfo_t structure
>> whenever it receives one...

Yes, thanks Darrick.

>
> OK.  Let me try it and figure out what's wrong in it.

I injected poison via "ndctl inject-error", though I don't expect that to
make any difference.

Any luck?

thanks,
-jane