LinuxLists.cc - A NFS, xfs, reflink and rmapbt story

2020-01-23 08:33:15

Subject: A NFS, xfs, reflink and rmapbt story

Hi,

Deleting the files left by generic/175 takes far too long when testing
on NFSv4.2 exporting xfs with rmapbt=1.

"./check -nfs generic/175 generic/176" should reproduce it.

My test bed is a VM with 16 CPUs and 8G of memory.

NFSv4.2  rmapbt=1  24h+
NFSv4.2  rmapbt=0  1h-2h
xfs      rmapbt=1  10m+

At first I thought it hung; it turns out it was just slow when deleting
2 massive reflinked files.

It's reproducible with the latest Linus tree and with Darrick's
deferred-inactivation branch, using xfsprogs from the latest for-next branch.

I'm not sure anything is actually wrong; I'm just sharing this with you.
I don't recall identifying this as a regression. It has probably been
there for a long time.

Sending to both the xfs and nfs lists because it looks related to both. :)

This almost got lost in my list. I didn't record much information, just
some trace-cmd output for your reference. It's easy to reproduce. If this
is interesting to you and you need more info, feel free to ask.

Thanks,

7) 0.279 us | xfs_btree_get_block [xfs]();
7) 0.303 us | xfs_btree_rec_offset [xfs]();
7) 0.301 us | xfs_rmapbt_init_high_key_from_rec [xfs]();
7) 0.356 us | xfs_rmapbt_diff_two_keys [xfs]();
7) 0.305 us | xfs_rmapbt_init_key_from_rec [xfs]();
7) 0.306 us | xfs_rmapbt_diff_two_keys [xfs]();
7) | xfs_rmap_query_range_helper [xfs]() {
7) 0.279 us | xfs_rmap_btrec_to_irec [xfs]();
7) | xfs_rmap_lookup_le_range_helper [xfs]() {
1) 0.786 us | _raw_spin_lock_irqsave();
7) | /* xfs_rmap_lookup_le_range_candidate: dev 8:34 agno 2 agbno 6416 len 256 owner 67160161 offset 99284480 flags 0x0 */
7) 0.506 us | }
7) 1.680 us | }
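
For anyone who wants to summarise a longer capture of this kind, here is a minimal sketch that totals the time of leaf calls per function. It assumes the function_graph line layout shown above ("cpu) duration us | function();"). The regular expression and the helper name leaf_time_by_function are illustrative and not part of trace-cmd.

```python
import re
from collections import defaultdict

# Matches leaf-call lines such as:
#   7) 0.279 us | xfs_btree_get_block [xfs]();
# Lines that open or close a nested call ("{" or "}") are skipped.
LEAF = re.compile(r"^\s*(\d+)\)\s+([\d.]+) us\s+\|\s+(\w+)(?: \[\w+\])?\(\);")

sample = """\
7) 0.279 us | xfs_btree_get_block [xfs]();
7) 0.303 us | xfs_btree_rec_offset [xfs]();
7) 0.356 us | xfs_rmapbt_diff_two_keys [xfs]();
7) 0.306 us | xfs_rmapbt_diff_two_keys [xfs]();
7) | xfs_rmap_query_range_helper [xfs]() {
1) 0.786 us | _raw_spin_lock_irqsave();
7) 1.680 us | }
"""

def leaf_time_by_function(text):
    """Sum the reported duration (in microseconds) of each leaf function."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LEAF.match(line)
        if match:
            totals[match.group(3)] += float(match.group(2))
    return dict(totals)

totals = leaf_time_by_function(sample)
for name, usecs in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{usecs:8.3f} us  {name}")
```

Sorting by total time gives a quick view of which helpers dominate a capture, though it ignores time spent in non-leaf frames.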

2020-01-24 01:18:43

by Darrick J. Wong

Subject: Re: A NFS, xfs, reflink and rmapbt story

On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> Hi,
>
> Deleting the files left by generic/175 costs too much time when testing
> on NFSv4.2 exporting xfs with rmapbt=1.
>
> "./check -nfs generic/175 generic/176" should reproduce it.
>
> My test bed is a 16c8G vm.

What kind of storage?

> NFSv4.2 rmapbt=1 24h+

<URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
transactions on the inactivation? (speculates wildly at the end of the
workday)

I'll have a look in the morning. It might take me a while to remember
how to set up NFS42 :)

--D

> NFSv4.2 rmapbt=0 1h-2h
> xfs rmapbt=1 10m+
>
> At first I thought it hung, turns out it was just slow when deleting
> 2 massive reflinked files.
>
> It's reproducible using latest Linus tree, and Darrick's deferred-inactivation
> branch. Run latest for-next branch xfsprogs.
>
> I'm not sure it's something wrong, just sharing with you guys. I don't
> remember I have identified this as a regression. It should be there for
> a long time.
>
> Sending to xfs and nfs because it looks like all related. :)
>
> This almost gets lost in my list. Not much information recorded, some
> trace-cmd outputs for your info. It's easy to reproduce. If it's
> interesting to you and need any info, feel free to ask.
>
> Thanks,
>
>
> 7) 0.279 us | xfs_btree_get_block [xfs]();
> 7) 0.303 us | xfs_btree_rec_offset [xfs]();
> 7) 0.301 us | xfs_rmapbt_init_high_key_from_rec [xfs]();
> 7) 0.356 us | xfs_rmapbt_diff_two_keys [xfs]();
> 7) 0.305 us | xfs_rmapbt_init_key_from_rec [xfs]();
> 7) 0.306 us | xfs_rmapbt_diff_two_keys [xfs]();
> 7) | xfs_rmap_query_range_helper [xfs]() {
> 7) 0.279 us | xfs_rmap_btrec_to_irec [xfs]();
> 7) | xfs_rmap_lookup_le_range_helper [xfs]() {
> 1) 0.786 us | _raw_spin_lock_irqsave();
> 7) | /* xfs_rmap_lookup_le_range_candidate: dev 8:34 agno 2 agbno 6416 len 256 owner 67160161 offset 99284480 flags 0x0 */
> 7) 0.506 us | }
> 7) 1.680 us | }

2020-01-27 22:38:11

by J. Bruce Fields

Subject: Re: A NFS, xfs, reflink and rmapbt story

On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > Hi,
> >
> > Deleting the files left by generic/175 costs too much time when testing
> > on NFSv4.2 exporting xfs with rmapbt=1.
> >
> > "./check -nfs generic/175 generic/176" should reproduce it.
> >
> > My test bed is a 16c8G vm.
>
> What kind of storage?
>
> > NFSv4.2 rmapbt=1 24h+
>
> <URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
> transactions on the inactivation? (speculates wildly at the end of the
> workday)
>
> I'll have a look in the morning. It might take me a while to remember
> how to set up NFS42 :)

It may just be the default on a recent enough distro.

Though I'd be a little surprised if this behavior is specific to the
protocol version.

nfsd_unlink() is basically just vfs_unlink() followed by
commit_metadata().
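
A rough user-space analogy of that sequence, as a sketch only: this is not the nfsd code. The unlink stands in for vfs_unlink() and the fsync of the parent directory stands in for the commit_metadata() step. The helper name unlink_then_commit is made up for illustration.

```python
import os
import tempfile

def unlink_then_commit(path):
    """Remove `path`, then force the directory change to stable storage.

    Analogy only: os.unlink() plays the role of vfs_unlink(), and the
    fsync() on the parent directory plays the role of commit_metadata().
    """
    parent = os.path.dirname(os.path.abspath(path))
    os.unlink(path)
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # wait until the removal is durable
    finally:
        os.close(dir_fd)

with tempfile.TemporaryDirectory() as tmp:
    victim = os.path.join(tmp, "victim")
    with open(victim, "w") as f:
        f.write("data")
    unlink_then_commit(victim)
    removed = not os.path.exists(victim)
    print("removed:", removed)
```

The point of the analogy is that the removal itself is an ordinary operation, and the only extra work on the server side is the wait for it to become stable.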

--b.

>
> --D
>
> > NFSv4.2 rmapbt=0 1h-2h
> > xfs rmapbt=1 10m+
> >
> > At first I thought it hung, turns out it was just slow when deleting
> > 2 massive reflinked files.
> >
> > It's reproducible using latest Linus tree, and Darrick's deferred-inactivation
> > branch. Run latest for-next branch xfsprogs.
> >
> > I'm not sure it's something wrong, just sharing with you guys. I don't
> > remember I have identified this as a regression. It should be there for
> > a long time.
> >
> > Sending to xfs and nfs because it looks like all related. :)
> >
> > This almost gets lost in my list. Not much information recorded, some
> > trace-cmd outputs for your info. It's easy to reproduce. If it's
> > interesting to you and need any info, feel free to ask.
> >
> > Thanks,
> >
> >
> > 7) 0.279 us | xfs_btree_get_block [xfs]();
> > 7) 0.303 us | xfs_btree_rec_offset [xfs]();
> > 7) 0.301 us | xfs_rmapbt_init_high_key_from_rec [xfs]();
> > 7) 0.356 us | xfs_rmapbt_diff_two_keys [xfs]();
> > 7) 0.305 us | xfs_rmapbt_init_key_from_rec [xfs]();
> > 7) 0.306 us | xfs_rmapbt_diff_two_keys [xfs]();
> > 7) | xfs_rmap_query_range_helper [xfs]() {
> > 7) 0.279 us | xfs_rmap_btrec_to_irec [xfs]();
> > 7) | xfs_rmap_lookup_le_range_helper [xfs]() {
> > 1) 0.786 us | _raw_spin_lock_irqsave();
> > 7) | /* xfs_rmap_lookup_le_range_candidate: dev 8:34 agno 2 agbno 6416 len 256 owner 67160161 offset 99284480 flags 0x0 */
> > 7) 0.506 us | }
> > 7) 1.680 us | }

2020-01-27 23:57:15

by Dave Chinner

Subject: Re: A NFS, xfs, reflink and rmapbt story

On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > Hi,
> >
> > Deleting the files left by generic/175 costs too much time when testing
> > on NFSv4.2 exporting xfs with rmapbt=1.
> >
> > "./check -nfs generic/175 generic/176" should reproduce it.
> >
> > My test bed is a 16c8G vm.
>
> What kind of storage?

Is the NFS server the same machine as what the local XFS tests were
run on?

> > NFSv4.2 rmapbt=1 24h+
>
> <URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
> transactions on the inactivation? (speculates wildly at the end of the
> workday)

Doubt it - NFS server uses ->commit_metadata after the async
operation to ensure that it is completed and on stable storage, so
the truncate on inactivation should run at pretty much the same
speed as on a local filesystem as it's still all async commits. i.e.
the only difference on the NFS server is the log force that follows
the inode inactivation...

> I'll have a look in the morning. It might take me a while to remember
> how to set up NFS42 :)
>
> --D
>
> > NFSv4.2 rmapbt=0 1h-2h
> > xfs rmapbt=1 10m+
> >
> > At first I thought it hung, turns out it was just slow when deleting
> > 2 massive reflinked files.

Both tests run on the scratch device, so I don't see where there is
a large file unlink in either of these tests.

In which case, I'd expect that all the time is consumed in
generic/176 running punch_alternating to create a million extents
as that will effectively run a synchronous server-side hole punch
half a million times.
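
To make the arithmetic concrete, here is a small sketch of the offsets such a pass would punch. It assumes a file of 1,000,000 blocks of 4096 bytes and a pass that punches every other block starting at block 0. Those sizes and the helper name alternating_punch_ranges are illustrative assumptions, not taken from the test itself.

```python
def alternating_punch_ranges(file_size, block_size):
    """Yield (offset, length) for every other block of the file.

    Each pair is one hole-punch request; over NFS each request would be
    a separate server-side operation.
    """
    for offset in range(0, file_size, 2 * block_size):
        yield offset, min(block_size, file_size - offset)

# Assumed sizes, chosen only to match the "half a million" figure.
block_size = 4096
blocks = 1_000_000

ranges = list(alternating_punch_ranges(blocks * block_size, block_size))
punches = len(ranges)
print(punches, ranges[:3])
```

With these assumed sizes the pass issues 500,000 separate punch requests, which is why the cost of each synchronous request matters so much.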

However, I'm guessing that the server side filesystem has a very
small log and is on spinning rust, hence the ->commit_metadata log
forces are preventing in-memory aggregation of modifications. This
results in the working set of metadata not fitting in the log and so
each new hole punch transaction ends up waiting on log tail pushing
(i.e. metadata writeback IO). i.e. it's thrashing the disk, and
that's why it is slow.....

Storage details, please!

Cheers,

Dave.
--
Dave Chinner

2020-02-05 06:24:10

by Murphy Zhou

Subject: Re: A NFS, xfs, reflink and rmapbt story

On Mon, Jan 27, 2020 at 05:36:31PM -0500, J. Bruce Fields wrote:
> On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > > Hi,
> > >
> > > Deleting the files left by generic/175 costs too much time when testing
> > > on NFSv4.2 exporting xfs with rmapbt=1.
> > >
> > > "./check -nfs generic/175 generic/176" should reproduce it.
> > >
> > > My test bed is a 16c8G vm.
> >
> > What kind of storage?
> >
> > > NFSv4.2 rmapbt=1 24h+
> >
> > <URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
> > transactions on the inactivation? (speculates wildly at the end of the
> > workday)
> >
> > I'll have a look in the morning. It might take me a while to remember
> > how to set up NFS42 :)
>
> It may just be the default on a recent enough distro.
>
> Though I'd be a little surprised if this behavior is specific to the
> protocol version.

This test case requires reflink, which is only available in v4.2.
With other protocol versions, this test case does not run.

Murphy

>
> nfsd_unlink() is basically just vfs_unlink() followed by
> commit_metadata().
>
> --b.
>
> >
> > --D
> >
> > > NFSv4.2 rmapbt=0 1h-2h
> > > xfs rmapbt=1 10m+
> > >
> > > At first I thought it hung, turns out it was just slow when deleting
> > > 2 massive reflinked files.
> > >
> > > It's reproducible using latest Linus tree, and Darrick's deferred-inactivation
> > > branch. Run latest for-next branch xfsprogs.
> > >
> > > I'm not sure it's something wrong, just sharing with you guys. I don't
> > > remember I have identified this as a regression. It should be there for
> > > a long time.
> > >
> > > Sending to xfs and nfs because it looks like all related. :)
> > >
> > > This almost gets lost in my list. Not much information recorded, some
> > > trace-cmd outputs for your info. It's easy to reproduce. If it's
> > > interesting to you and need any info, feel free to ask.
> > >
> > > Thanks,
> > >
> > >
> > > 7) 0.279 us | xfs_btree_get_block [xfs]();
> > > 7) 0.303 us | xfs_btree_rec_offset [xfs]();
> > > 7) 0.301 us | xfs_rmapbt_init_high_key_from_rec [xfs]();
> > > 7) 0.356 us | xfs_rmapbt_diff_two_keys [xfs]();
> > > 7) 0.305 us | xfs_rmapbt_init_key_from_rec [xfs]();
> > > 7) 0.306 us | xfs_rmapbt_diff_two_keys [xfs]();
> > > 7) | xfs_rmap_query_range_helper [xfs]() {
> > > 7) 0.279 us | xfs_rmap_btrec_to_irec [xfs]();
> > > 7) | xfs_rmap_lookup_le_range_helper [xfs]() {
> > > 1) 0.786 us | _raw_spin_lock_irqsave();
> > > 7) | /* xfs_rmap_lookup_le_range_candidate: dev 8:34 agno 2 agbno 6416 len 256 owner 67160161 offset 99284480 flags 0x0 */
> > > 7) 0.506 us | }
> > > 7) 1.680 us | }

2020-02-05 06:53:55

by Murphy Zhou

Subject: Re: A NFS, xfs, reflink and rmapbt story

On Tue, Jan 28, 2020 at 10:56:17AM +1100, Dave Chinner wrote:
> On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > > Hi,
> > >
> > > Deleting the files left by generic/175 costs too much time when testing
> > > on NFSv4.2 exporting xfs with rmapbt=1.
> > >
> > > "./check -nfs generic/175 generic/176" should reproduce it.
> > >
> > > My test bed is a 16c8G vm.
> >
> > What kind of storage?

Loop device in guest.

# Host:

[root@ibm-x3850x5-03]$ lsblk
NAME                            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                               8:0    0  2.7T  0 disk
├─sda1                            8:1    0    1M  0 part
├─sda2                            8:2    0    1G  0 part /boot
└─sda3                            8:3    0  2.7T  0 part
  ├─rhel_ibm--x3850x5--03-root  253:0    0  550G  0 lvm  /
  ├─rhel_ibm--x3850x5--03-swap  253:1    0 27.6G  0 lvm  [SWAP]
  ├─rhel_ibm--x3850x5--03-home  253:2    0  1.7T  0 lvm  /home
  ├─rhel_ibm--x3850x5--03-test1 253:3    0   10G  0 lvm
  └─rhel_ibm--x3850x5--03-test2 253:4    0   10G  0 lvm
loop0                             7:0    0    1G  0 loop
loop1                             7:1    0    1G  0 loop
[root@ibm-x3850x5-03]$ smartctl -a /dev/sda
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1115.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: IBM
Product: ServeRAID M5015
Revision: 2.13
Compliance: SPC-3
User Capacity: 2,996,997,980,160 bytes [2.99 TB]
Logical block size: 512 bytes
Logical Unit id: 0x600605b001665aa019cb17be1e9ce991
Serial number: 0091e99c1ebe17cb19a05a6601b00506
Device type: disk
Local Time is: Wed Feb 5 14:35:57 2020 CST
SMART support is: Unavailable - device lacks SMART capability.

=== START OF READ SMART DATA SECTION ===
Current Drive Temperature: 0 C
Drive Trip Temperature: 0 C

Error Counter logging not supported

Device does not support Self Test logging
[root@ibm-x3850x5-03]$ virsh domblklist 8u
Target Source
------------------------------------------------
hda /home/8u.qcow2
hdb /home/8ut.qcow2
hdc /home/8ut1.qcow2

[root@ibm-x3850x5-03]$

# Guest:

[root@8u]$ lsblk
NAME          MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda             8:0    0  800G  0 disk
├─sda1          8:1    0    2G  0 part
│ └─rhel-swap 253:0    0    2G  0 lvm  [SWAP]
└─sda2          8:2    0  798G  0 part /
sdb             8:16   0  200G  0 disk /home
sdc             8:32   0  100G  0 disk
├─sdc1          8:33   0   50G  0 part
└─sdc2          8:34   0   50G  0 part
pmem0         259:0    0    5G  0 disk
[root@8u]$ smartctl -a /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.5.0-v5.5-9386-g33b4013] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, http://www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: QEMU HARDDISK
Serial Number: QM00003
Firmware Version: 1.5.3
User Capacity: 214,748,364,800 bytes [214 GB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA/ATAPI-7, ATA/ATAPI-5 published, ANSI NCITS 340-2000
Local Time is: Wed Feb 5 14:39:18 2020 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 288) seconds.
Offline data collection
capabilities:                    (0x19) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  54) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0003 100   100   006    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0003 100   100   000    Pre-fail Always  -           16
  4 Start_Stop_Count        0x0002 100   100   020    Old_age  Always  -           100
  5 Reallocated_Sector_Ct   0x0003 100   100   036    Pre-fail Always  -           0
  9 Power_On_Hours          0x0003 100   100   000    Pre-fail Always  -           1
 12 Power_Cycle_Count       0x0003 100   100   000    Pre-fail Always  -           0
190 Airflow_Temperature_Cel 0x0003 069   069   050    Pre-fail Always  -           31 (Min/Max 31/31)

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

Selective Self-tests/Logging not supported

[root@8u]$

>
> Is the NFS server the same machine as what the local XFS tests were
> run on?

Yes. It's also reproducible when testing on remote NFS mounts.

>
> > > NFSv4.2 rmapbt=1 24h+
> >
> > <URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
> > transactions on the inactivation? (speculates wildly at the end of the
> > workday)
>
> Doubt it - NFS server uses ->commit_metadata after the async
> operation to ensure that it is completed and on stable storage, so
> the truncate on inactivation should run at pretty much the same
> speed as on a local filesystem as it's still all async commits. i.e.
> the only difference on the NFS server is the log force that follows
> the inode inactivation...
>
> > I'll have a look in the morning. It might take me a while to remember
> > how to set up NFS42 :)
> >
> > --D
> >
> > > NFSv4.2 rmapbt=0 1h-2h
> > > xfs rmapbt=1 10m+
> > >
> > > At first I thought it hung, turns out it was just slow when deleting
> > > 2 massive reflinked files.
>
> Both tests run on the scratch device, so I don't see where there is
> a large file unlink in either of these tests.
>
> In which case, I'd expect that all the time is consumed in
> generic/176 running punch_alternating to create a million extents
> as that will effectively run a synchronous server-side hole punch
> half a million times.

I've tracked this down. The time was consumed by the "rm -rf" in
_scratch_mkfs of generic/176. See the thread at
https://www.spinics.net/lists/fstests/msg13316.html

Thanks,
Murphy

>
> However, I'm guessing that the server side filesystem has a very
> small log and is on spinning rust, hence the ->commit_metadata log
> forces are preventing in-memory aggregation of modifications. This
> results in the working set of metadata not fitting in the log and so
> each new hole punch transaction ends up waiting on log tail pushing
> (i.e. metadata writeback IO). i.e. it's thrashing the disk, and
> that's why it is slow.....
>
> Storage details, please!
>
> Cheers,
>
> Dave.
> --
> Dave Chinner

2020-02-16 08:29:31

by Murphy Zhou

Subject: Re: A NFS, xfs, reflink and rmapbt story

Hi Bruce,

On Mon, Jan 27, 2020 at 05:36:31PM -0500, J. Bruce Fields wrote:
> On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > > Hi,
> > >
> > > Deleting the files left by generic/175 costs too much time when testing
> > > on NFSv4.2 exporting xfs with rmapbt=1.
> > >
> > > "./check -nfs generic/175 generic/176" should reproduce it.
> > >
> > > My test bed is a 16c8G vm.
> >
> > What kind of storage?
> >
> > > NFSv4.2 rmapbt=1 24h+
> >
> > <URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
> > transactions on the inactivation? (speculates wildly at the end of the
> > workday)
> >
> > I'll have a look in the morning. It might take me a while to remember
> > how to set up NFS42 :)
>
> It may just be the default on a recent enough distro.
>
> Though I'd be a little surprised if this behavior is specific to the
> protocol version.

Can the NFS client or server know that a file has reflinked parts? Is
there anything like a flag or a bit that tracks this?

Thanks!
Murphy
>
> nfsd_unlink() is basically just vfs_unlink() followed by
> commit_metadata().
>
> --b.
>
> >
> > --D
> >
> > > NFSv4.2 rmapbt=0 1h-2h
> > > xfs rmapbt=1 10m+
> > >
> > > At first I thought it hung, turns out it was just slow when deleting
> > > 2 massive reflinked files.
> > >
> > > It's reproducible using latest Linus tree, and Darrick's deferred-inactivation
> > > branch. Run latest for-next branch xfsprogs.
> > >
> > > I'm not sure it's something wrong, just sharing with you guys. I don't
> > > remember I have identified this as a regression. It should be there for
> > > a long time.
> > >
> > > Sending to xfs and nfs because it looks like all related. :)
> > >
> > > This almost gets lost in my list. Not much information recorded, some
> > > trace-cmd outputs for your info. It's easy to reproduce. If it's
> > > interesting to you and need any info, feel free to ask.
> > >
> > > Thanks,
> > >
> > >
> > > 7) 0.279 us | xfs_btree_get_block [xfs]();
> > > 7) 0.303 us | xfs_btree_rec_offset [xfs]();
> > > 7) 0.301 us | xfs_rmapbt_init_high_key_from_rec [xfs]();
> > > 7) 0.356 us | xfs_rmapbt_diff_two_keys [xfs]();
> > > 7) 0.305 us | xfs_rmapbt_init_key_from_rec [xfs]();
> > > 7) 0.306 us | xfs_rmapbt_diff_two_keys [xfs]();
> > > 7) | xfs_rmap_query_range_helper [xfs]() {
> > > 7) 0.279 us | xfs_rmap_btrec_to_irec [xfs]();
> > > 7) | xfs_rmap_lookup_le_range_helper [xfs]() {
> > > 1) 0.786 us | _raw_spin_lock_irqsave();
> > > 7) | /* xfs_rmap_lookup_le_range_candidate: dev 8:34 agno 2 agbno 6416 len 256 owner 67160161 offset 99284480 flags 0x0 */
> > > 7) 0.506 us | }
> > > 7) 1.680 us | }

2020-02-17 00:37:43

by J. Bruce Fields

Subject: Re: A NFS, xfs, reflink and rmapbt story

On Sun, Feb 16, 2020 at 04:28:51PM +0800, Murphy Zhou wrote:
> Hi Bruce,
>
> On Mon, Jan 27, 2020 at 05:36:31PM -0500, J. Bruce Fields wrote:
> > On Thu, Jan 23, 2020 at 05:10:19PM -0800, Darrick J. Wong wrote:
> > > On Thu, Jan 23, 2020 at 04:32:17PM +0800, Murphy Zhou wrote:
> > > > Hi,
> > > >
> > > > Deleting the files left by generic/175 costs too much time when testing
> > > > on NFSv4.2 exporting xfs with rmapbt=1.
> > > >
> > > > "./check -nfs generic/175 generic/176" should reproduce it.
> > > >
> > > > My test bed is a 16c8G vm.
> > >
> > > What kind of storage?
> > >
> > > > NFSv4.2 rmapbt=1 24h+
> > >
> > > <URK> Wow. I wonder what about NFS makes us so slow now? Synchronous
> > > transactions on the inactivation? (speculates wildly at the end of the
> > > workday)
> > >
> > > I'll have a look in the morning. It might take me a while to remember
> > > how to set up NFS42 :)
> >
> > It may just be the default on a recent enough distro.
> >
> > Though I'd be a little surprised if this behavior is specific to the
> > protocol version.
>
> Can the NFS client or server know that a file has reflinked parts? Is
> there anything like a flag or a bit that tracks this?

Not that I'm aware of.

--b.