From: Daeho Jeong <[email protected]>
When we print out a discontinuous compression chunk, it shows like a
continuous chunk now. To show it more correctly, I've changed the way of
printing fiemap info like below. Plus, eliminated NEW_ADDR(-1) in fiemap
info, since it is not in fiemap user api manual.
0: 0000000000000000 0000000000000000 0000000000001000 1008 (M/E)
1: 0000000000001000 0000000f15c0f000 0000000000001000 1008 (M/E)
2: 0000000000002000 0000000000000000 0000000000002000 1808 (M/U/E)
3: 0000000000004000 0000000000000000 0000000000001000 1008 (M/E)
4: 0000000000005000 0000000f15c10000 0000000000001000 1008 (M/E)
5: 0000000000006000 0000000000000000 0000000000002000 1808 (M/U/E)
6: 0000000000008000 0000000000000000 0000000000001000 1008 (M/E)
M => FIEMAP_EXTENT_MERGED
E => FIEMAP_EXTENT_ENCODED
U => FIEMAP_EXTENT_UNWRITTEN
Signed-off-by: Daeho Jeong <[email protected]>
---
fs/f2fs/data.c | 42 ++++++++++++++++--------------------------
1 file changed, 16 insertions(+), 26 deletions(-)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 3a01a1b50104..6e1be876d96b 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1845,6 +1845,7 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
int ret = 0;
bool compr_cluster = false;
unsigned int cluster_size = F2FS_I(inode)->i_cluster_size;
+ unsigned int count_in_cluster;
loff_t maxbytes;
if (fieinfo->fi_flags & FIEMAP_FLAG_CACHE) {
@@ -1893,7 +1894,7 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
map.m_seg_type = NO_CHECK_TYPE;
if (compr_cluster)
- map.m_len = cluster_size - 1;
+ map.m_len = cluster_size - count_in_cluster;
ret = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_FIEMAP);
if (ret)
@@ -1926,37 +1927,26 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
if (start_blk > last_blk)
goto out;
- if (compr_cluster) {
- compr_cluster = false;
-
-
- logical = blks_to_bytes(inode, start_blk - 1);
- phys = blks_to_bytes(inode, map.m_pblk);
- size = blks_to_bytes(inode, cluster_size);
-
- flags |= FIEMAP_EXTENT_ENCODED;
-
- start_blk += cluster_size - 1;
-
- if (start_blk > last_blk)
- goto out;
-
- goto prep_next;
- }
-
- if (map.m_pblk == COMPRESS_ADDR) {
- compr_cluster = true;
- start_blk++;
- goto prep_next;
- }
-
logical = blks_to_bytes(inode, start_blk);
- phys = blks_to_bytes(inode, map.m_pblk);
+ phys = __is_valid_data_blkaddr(map.m_pblk) ?
+ blks_to_bytes(inode, map.m_pblk) : 0;
size = blks_to_bytes(inode, map.m_len);
flags = 0;
if (map.m_flags & F2FS_MAP_UNWRITTEN)
flags = FIEMAP_EXTENT_UNWRITTEN;
+ if (map.m_pblk == COMPRESS_ADDR) {
+ flags |= FIEMAP_EXTENT_ENCODED;
+ compr_cluster = true;
+ count_in_cluster = 1;
+ } else if (compr_cluster) {
+ flags |= FIEMAP_EXTENT_ENCODED;
+ if (map.m_len > 0)
+ count_in_cluster += map.m_len;
+ if (count_in_cluster == cluster_size)
+ compr_cluster = false;
+ }
+
start_blk += bytes_to_blks(inode, size);
prep_next:
--
2.32.0.402.g57bb445576-goog
On Wed, Jul 21, 2021 at 12:20:48AM -0700, Daeho Jeong wrote:
> From: Daeho Jeong <[email protected]>
>
> When we print out a discontinuous compression chunk, it shows like a
> continuous chunk now. To show it more correctly, I've changed the way of
> printing fiemap info like below. Plus, eliminated NEW_ADDR(-1) in fiemap
> info, since it is not in fiemap user api manual.
>
> 0: 0000000000000000 0000000000000000 0000000000001000 1008 (M/E)
> 1: 0000000000001000 0000000f15c0f000 0000000000001000 1008 (M/E)
> 2: 0000000000002000 0000000000000000 0000000000002000 1808 (M/U/E)
> 3: 0000000000004000 0000000000000000 0000000000001000 1008 (M/E)
> 4: 0000000000005000 0000000f15c10000 0000000000001000 1008 (M/E)
> 5: 0000000000006000 0000000000000000 0000000000002000 1808 (M/U/E)
> 6: 0000000000008000 0000000000000000 0000000000001000 1008 (M/E)
Please label these columns.
Anyway, this doesn't appear to work quite in the way I had in mind. With this
patch, what I'm seeing is:
$ head -c 16384 /dev/zero > file; xfs_io -c "fiemap -v" file
file:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..7]: 0..7 8 0x1008
1: [8..15]: 2683128..2683135 8 0x1008
2: [16..31]: 0..15 16 0x1809
So, working in 512-byte sectors, the logical sectors 0-31 are stored as one
compressed cluster in the 8 physical sectors 2683128-2683135.
The problem is, with this patch these physical sectors are reported at logical
sectors 8-15 instead of 0-7. Obviously, this isn't particularly well-defined,
but I thought it was logical for the physical blocks to be associated with the
first logical blocks. That is what the tests I've written (xfstest f2fs/002,
and the Android vts_kernel_encryption_test) assume.
Is there any particular reason why you wouldn't report instead:
0: [0..7]: 2683128..2683135 8 0x1008
1: [8..31]: 0..23 8 0x1809
- Eric
Hi Eric,
On Wed, Jul 21, 2021 at 2:35 PM Eric Biggers <[email protected]> wrote:
>
> On Wed, Jul 21, 2021 at 12:20:48AM -0700, Daeho Jeong wrote:
> > From: Daeho Jeong <[email protected]>
> >
> > When we print out a discontinuous compression chunk, it shows like a
> > continuous chunk now. To show it more correctly, I've changed the way of
> > printing fiemap info like below. Plus, eliminated NEW_ADDR(-1) in fiemap
> > info, since it is not in fiemap user api manual.
> >
> > 0: 0000000000000000 0000000000000000 0000000000001000 1008 (M/E)
> > 1: 0000000000001000 0000000f15c0f000 0000000000001000 1008 (M/E)
> > 2: 0000000000002000 0000000000000000 0000000000002000 1808 (M/U/E)
> > 3: 0000000000004000 0000000000000000 0000000000001000 1008 (M/E)
> > 4: 0000000000005000 0000000f15c10000 0000000000001000 1008 (M/E)
> > 5: 0000000000006000 0000000000000000 0000000000002000 1808 (M/U/E)
> > 6: 0000000000008000 0000000000000000 0000000000001000 1008 (M/E)
>
> Please label these columns.
>
> Anyway, this doesn't appear to work quite in the way I had in mind. With this
> patch, what I'm seeing is:
>
> $ head -c 16384 /dev/zero > file; xfs_io -c "fiemap -v" file
> file:
> EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> 0: [0..7]: 0..7 8 0x1008
> 1: [8..15]: 2683128..2683135 8 0x1008
> 2: [16..31]: 0..15 16 0x1809
>
> So, working in 512-byte sectors, the logical sectors 0-31 are stored as one
> compressed cluster in the 8 physical sectors 2683128-2683135.
>
> The problem is, with this patch these physical sectors are reported at logical
> sectors 8-15 instead of 0-7. Obviously, this isn't particularly well-defined,
> but I thought it was logical for the physical blocks to be associated with the
> first logical blocks. That is what the tests I've written (xfstest f2fs/002,
> and the Android vts_kernel_encryption_test) assume.
>
> Is there any particular reason why you wouldn't report instead:
>
> 0: [0..7]: 2683128..2683135 8 0x1008
> 1: [8..31]: 0..23 8 0x1809
>
> - Eric
The reason is related to how F2FS stores the mapping information in
the mapping table.
Actually, the mapping inside of the table is like this.
[0..7]: COMPR_ADDR flag(0x1008) -> merged, encoded
[8..15]: 2683128..2683135 flag(0x1008) -> merged, encoded
[16..31]: NEW_ADDR flag(0x1809) -> merged, unwritten(!), last_extent
I understand what you mean.
But, if we adapt to your way, how do you think we can print out when
we ask for f2fs to print out only the [8..15] area? Zero address? How
about flags?
I think the current way explains the layout of the f2fs metadata more exactly.
Thank you,
On Wed, Jul 21, 2021 at 03:30:46PM -0700, Daeho Jeong wrote:
> Hi Eric,
>
> On Wed, Jul 21, 2021 at 2:35 PM Eric Biggers <[email protected]> wrote:
> >
> > On Wed, Jul 21, 2021 at 12:20:48AM -0700, Daeho Jeong wrote:
> > > From: Daeho Jeong <[email protected]>
> > >
> > > When we print out a discontinuous compression chunk, it shows like a
> > > continuous chunk now. To show it more correctly, I've changed the way of
> > > printing fiemap info like below. Plus, eliminated NEW_ADDR(-1) in fiemap
> > > info, since it is not in fiemap user api manual.
> > >
> > > 0: 0000000000000000 0000000000000000 0000000000001000 1008 (M/E)
> > > 1: 0000000000001000 0000000f15c0f000 0000000000001000 1008 (M/E)
> > > 2: 0000000000002000 0000000000000000 0000000000002000 1808 (M/U/E)
> > > 3: 0000000000004000 0000000000000000 0000000000001000 1008 (M/E)
> > > 4: 0000000000005000 0000000f15c10000 0000000000001000 1008 (M/E)
> > > 5: 0000000000006000 0000000000000000 0000000000002000 1808 (M/U/E)
> > > 6: 0000000000008000 0000000000000000 0000000000001000 1008 (M/E)
> >
> > Please label these columns.
> >
> > Anyway, this doesn't appear to work quite in the way I had in mind. With this
> > patch, what I'm seeing is:
> >
> > $ head -c 16384 /dev/zero > file; xfs_io -c "fiemap -v" file
> > file:
> > EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> > 0: [0..7]: 0..7 8 0x1008
> > 1: [8..15]: 2683128..2683135 8 0x1008
> > 2: [16..31]: 0..15 16 0x1809
> >
> > So, working in 512-byte sectors, the logical sectors 0-31 are stored as one
> > compressed cluster in the 8 physical sectors 2683128-2683135.
> >
> > The problem is, with this patch these physical sectors are reported at logical
> > sectors 8-15 instead of 0-7. Obviously, this isn't particularly well-defined,
> > but I thought it was logical for the physical blocks to be associated with the
> > first logical blocks. That is what the tests I've written (xfstest f2fs/002,
> > and the Android vts_kernel_encryption_test) assume.
> >
> > Is there any particular reason why you wouldn't report instead:
> >
> > 0: [0..7]: 2683128..2683135 8 0x1008
> > 1: [8..31]: 0..23 8 0x1809
> >
> > - Eric
>
> The reason is related to how F2FS stores the mapping information in
> the mapping table.
> Actually, the mapping inside of the table is like this.
> [0..7]: COMPR_ADDR flag(0x1008) -> merged, encoded
> [8..15]: 2683128..2683135 flag(0x1008) -> merged, encoded
> [16..31]: NEW_ADDR flag(0x1809) -> merged, unwritten(!), last_extent
>
> I understand what you mean.
> But, if we adapt to your way, how do you think we can print out when
> we ask for f2fs to print out only the [8..15] area? Zero address? How
> about flags?
> I think the current way explains the layout of the f2fs metadata more exactly.
>
How f2fs stores the mapping information doesn't matter. That's an
implementation detail that shouldn't be exposed to userspace. The only thing
that should be exposed is the actual mapping, and for that it seems natural to
report the physical blocks first.
There is no perfect solution for how to handle the remaining logical blocks,
given that the fiemap API was not designed for compressed files, but I think we
should just go with extending the length of the last compressed extent in the
cluster to cover the remaining logical blocks, i.e.:
[0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
That's what btrfs does on compressed files.
- Eric
On Wed, Jul 21, 2021 at 5:15 PM Eric Biggers <[email protected]> wrote:
>
> On Wed, Jul 21, 2021 at 03:30:46PM -0700, Daeho Jeong wrote:
> > Hi Eric,
> >
> > On Wed, Jul 21, 2021 at 2:35 PM Eric Biggers <[email protected]> wrote:
> > >
> > > On Wed, Jul 21, 2021 at 12:20:48AM -0700, Daeho Jeong wrote:
> > > > From: Daeho Jeong <[email protected]>
> > > >
> > > > When we print out a discontinuous compression chunk, it shows like a
> > > > continuous chunk now. To show it more correctly, I've changed the way of
> > > > printing fiemap info like below. Plus, eliminated NEW_ADDR(-1) in fiemap
> > > > info, since it is not in fiemap user api manual.
> > > >
> > > > 0: 0000000000000000 0000000000000000 0000000000001000 1008 (M/E)
> > > > 1: 0000000000001000 0000000f15c0f000 0000000000001000 1008 (M/E)
> > > > 2: 0000000000002000 0000000000000000 0000000000002000 1808 (M/U/E)
> > > > 3: 0000000000004000 0000000000000000 0000000000001000 1008 (M/E)
> > > > 4: 0000000000005000 0000000f15c10000 0000000000001000 1008 (M/E)
> > > > 5: 0000000000006000 0000000000000000 0000000000002000 1808 (M/U/E)
> > > > 6: 0000000000008000 0000000000000000 0000000000001000 1008 (M/E)
> > >
> > > Please label these columns.
> > >
> > > Anyway, this doesn't appear to work quite in the way I had in mind. With this
> > > patch, what I'm seeing is:
> > >
> > > $ head -c 16384 /dev/zero > file; xfs_io -c "fiemap -v" file
> > > file:
> > > EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
> > > 0: [0..7]: 0..7 8 0x1008
> > > 1: [8..15]: 2683128..2683135 8 0x1008
> > > 2: [16..31]: 0..15 16 0x1809
> > >
> > > So, working in 512-byte sectors, the logical sectors 0-31 are stored as one
> > > compressed cluster in the 8 physical sectors 2683128-2683135.
> > >
> > > The problem is, with this patch these physical sectors are reported at logical
> > > sectors 8-15 instead of 0-7. Obviously, this isn't particularly well-defined,
> > > but I thought it was logical for the physical blocks to be associated with the
> > > first logical blocks. That is what the tests I've written (xfstest f2fs/002,
> > > and the Android vts_kernel_encryption_test) assume.
> > >
> > > Is there any particular reason why you wouldn't report instead:
> > >
> > > 0: [0..7]: 2683128..2683135 8 0x1008
> > > 1: [8..31]: 0..23 8 0x1809
> > >
> > > - Eric
> >
> > The reason is related to how F2FS stores the mapping information in
> > the mapping table.
> > Actually, the mapping inside of the table is like this.
> > [0..7]: COMPR_ADDR flag(0x1008) -> merged, encoded
> > [8..15]: 2683128..2683135 flag(0x1008) -> merged, encoded
> > [16..31]: NEW_ADDR flag(0x1809) -> merged, unwritten(!), last_extent
> >
> > I understand what you mean.
> > But, if we adapt to your way, how do you think we can print out when
> > we ask for f2fs to print out only the [8..15] area? Zero address? How
> > about flags?
> > I think the current way explains the layout of the f2fs metadata more exactly.
> >
>
> How f2fs stores the mapping information doesn't matter. That's an
> implementation detail that shouldn't be exposed to userspace. The only thing
> that should be exposed is the actual mapping, and for that it seems natural to
> report the physical blocks first.
>
> There is no perfect solution for how to handle the remaining logical blocks,
> given that the fiemap API was not designed for compressed files, but I think we
> should just go with extending the length of the last compressed extent in the
> cluster to cover the remaining logical blocks, i.e.:
>
> [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
>
> That's what btrfs does on compressed files.
>
> - Eric
I also agree that that's an implementation detail that shouldn't be
exposed to userspace.
I want to make it more clear for better appearance.
Do you think we have to remove "unwritten" information below? I also
think it might be unnecessary information for the user.
[0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
(unwritten?)
Do you want f2fs to print out the info on a cluster basis, even when
the user asks for one block information?
Like
If the user asks for the info of [8..15], f2fs will return the info of [0..31]?
Thank you,
On Wed, Jul 21, 2021 at 06:04:22PM -0700, Daeho Jeong wrote:
> >
> > How f2fs stores the mapping information doesn't matter. That's an
> > implementation detail that shouldn't be exposed to userspace. The only thing
> > that should be exposed is the actual mapping, and for that it seems natural to
> > report the physical blocks first.
> >
> > There is no perfect solution for how to handle the remaining logical blocks,
> > given that the fiemap API was not designed for compressed files, but I think we
> > should just go with extending the length of the last compressed extent in the
> > cluster to cover the remaining logical blocks, i.e.:
> >
> > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> >
> > That's what btrfs does on compressed files.
> >
> > - Eric
>
> I also agree that that's an implementation detail that shouldn't be
> exposed to userspace.
>
> I want to make it more clear for better appearance.
>
> Do you think we have to remove "unwritten" information below? I also
> think it might be unnecessary information for the user.
> [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> (unwritten?)
FIEMAP_EXTENT_UNWRITTEN already has a specific meaning; see
Documentation/filesystems/fiemap.rst. It means that the data is all zeroes, and
the disk space is preallocated but the data hasn't been written to disk yet.
In this case, the data is *not* necessarily all zeroes. So I think
FIEMAP_EXTENT_UNWRITTEN shouldn't be used here.
> Do you want f2fs to print out the info on a cluster basis, even when
> the user asks for one block information?
> Like
> If the user asks for the info of [8..15], f2fs will return the info of [0..31]?
Yes, since that's how FS_IOC_FIEMAP is supposed to work; see
Documentation/filesystems/fiemap.rst:
All offsets and lengths are in bytes and mirror those on disk. It is
valid for an extents logical offset to start before the request or its
logical length to extend past the request.
(That being said, the f2fs compression+encryption tests I've written don't
exercise this case; they only map the whole file at once.)
- Eric
On Wed, Jul 21, 2021 at 6:15 PM Eric Biggers <[email protected]> wrote:
>
> On Wed, Jul 21, 2021 at 06:04:22PM -0700, Daeho Jeong wrote:
> > >
> > > How f2fs stores the mapping information doesn't matter. That's an
> > > implementation detail that shouldn't be exposed to userspace. The only thing
> > > that should be exposed is the actual mapping, and for that it seems natural to
> > > report the physical blocks first.
> > >
> > > There is no perfect solution for how to handle the remaining logical blocks,
> > > given that the fiemap API was not designed for compressed files, but I think we
> > > should just go with extending the length of the last compressed extent in the
> > > cluster to cover the remaining logical blocks, i.e.:
> > >
> > > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> > >
> > > That's what btrfs does on compressed files.
> > >
> > > - Eric
> >
> > I also agree that that's an implementation detail that shouldn't be
> > exposed to userspace.
> >
> > I want to make it more clear for better appearance.
> >
> > Do you think we have to remove "unwritten" information below? I also
> > think it might be unnecessary information for the user.
> > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> > (unwritten?)
>
> FIEMAP_EXTENT_UNWRITTEN already has a specific meaning; see
> Documentation/filesystems/fiemap.rst. It means that the data is all zeroes, and
> the disk space is preallocated but the data hasn't been written to disk yet.
>
> In this case, the data is *not* necessarily all zeroes. So I think
> FIEMAP_EXTENT_UNWRITTEN shouldn't be used here.
>
> > Do you want f2fs to print out the info on a cluster basis, even when
> > the user asks for one block information?
> > Like
> > If the user asks for the info of [8..15], f2fs will return the info of [0..31]?
>
> Yes, since that's how FS_IOC_FIEMAP is supposed to work; see
> Documentation/filesystems/fiemap.rst:
>
> All offsets and lengths are in bytes and mirror those on disk. It is
> valid for an extents logical offset to start before the request or its
> logical length to extend past the request.
>
> (That being said, the f2fs compression+encryption tests I've written don't
> exercise this case; they only map the whole file at once.)
>
> - Eric
My last question is.
How about a discontinuous cluster like [0..31] maps to discontinuous
three blocks like physical address 0x4, 0x14 and 0x24.
I think we have to return three extents for the one logical region
like the below. What do you think?
[0..31] -> 0x4 (merged, encoded)
[0..31] -> 0x14 (merged, encoded)
[0..31] -> 0x24 (merged, encoded, last_extent)
On Wed, Jul 21, 2021 at 06:40:00PM -0700, Daeho Jeong wrote:
> On Wed, Jul 21, 2021 at 6:15 PM Eric Biggers <[email protected]> wrote:
> >
> > On Wed, Jul 21, 2021 at 06:04:22PM -0700, Daeho Jeong wrote:
> > > >
> > > > How f2fs stores the mapping information doesn't matter. That's an
> > > > implementation detail that shouldn't be exposed to userspace. The only thing
> > > > that should be exposed is the actual mapping, and for that it seems natural to
> > > > report the physical blocks first.
> > > >
> > > > There is no perfect solution for how to handle the remaining logical blocks,
> > > > given that the fiemap API was not designed for compressed files, but I think we
> > > > should just go with extending the length of the last compressed extent in the
> > > > cluster to cover the remaining logical blocks, i.e.:
> > > >
> > > > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> > > >
> > > > That's what btrfs does on compressed files.
> > > >
> > > > - Eric
> > >
> > > I also agree that that's an implementation detail that shouldn't be
> > > exposed to userspace.
> > >
> > > I want to make it more clear for better appearance.
> > >
> > > Do you think we have to remove "unwritten" information below? I also
> > > think it might be unnecessary information for the user.
> > > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> > > (unwritten?)
> >
> > FIEMAP_EXTENT_UNWRITTEN already has a specific meaning; see
> > Documentation/filesystems/fiemap.rst. It means that the data is all zeroes, and
> > the disk space is preallocated but the data hasn't been written to disk yet.
> >
> > In this case, the data is *not* necessarily all zeroes. So I think
> > FIEMAP_EXTENT_UNWRITTEN shouldn't be used here.
> >
> > > Do you want f2fs to print out the info on a cluster basis, even when
> > > the user asks for one block information?
> > > Like
> > > If the user asks for the info of [8..15], f2fs will return the info of [0..31]?
> >
> > Yes, since that's how FS_IOC_FIEMAP is supposed to work; see
> > Documentation/filesystems/fiemap.rst:
> >
> > All offsets and lengths are in bytes and mirror those on disk. It is
> > valid for an extents logical offset to start before the request or its
> > logical length to extend past the request.
> >
> > (That being said, the f2fs compression+encryption tests I've written don't
> > exercise this case; they only map the whole file at once.)
> >
> > - Eric
>
> My last question is.
> How about a discontinuous cluster like [0..31] maps to discontinuous
> three blocks like physical address 0x4, 0x14 and 0x24.
> I think we have to return three extents for the one logical region
> like the below. What do you think?
> [0..31] -> 0x4 (merged, encoded)
> [0..31] -> 0x14 (merged, encoded)
> [0..31] -> 0x24 (merged, encoded, last_extent)
No, please don't do that. struct fiemap_extent only has a single length field,
not separate lengths for fe_logical and fe_physical, so with your proposal there
would be no way to know how many physical blocks to take from each extent. It
would be reporting the same part of the file in contradictory ways.
Like I suggested originally, I think this case should be reported like:
fe_logical=0 fe_physical=16384 length=4096
fe_logical=4096 fe_physical=81920 length=4096
fe_logical=8192 fe_physical=147456 length=8192
It's not perfect, but I think it's the least bad option, for the reasons I've
explained previously...
- Eric
On Wed, Jul 21, 2021 at 6:56 PM Eric Biggers <[email protected]> wrote:
>
> On Wed, Jul 21, 2021 at 06:40:00PM -0700, Daeho Jeong wrote:
> > On Wed, Jul 21, 2021 at 6:15 PM Eric Biggers <[email protected]> wrote:
> > >
> > > On Wed, Jul 21, 2021 at 06:04:22PM -0700, Daeho Jeong wrote:
> > > > >
> > > > > How f2fs stores the mapping information doesn't matter. That's an
> > > > > implementation detail that shouldn't be exposed to userspace. The only thing
> > > > > that should be exposed is the actual mapping, and for that it seems natural to
> > > > > report the physical blocks first.
> > > > >
> > > > > There is no perfect solution for how to handle the remaining logical blocks,
> > > > > given that the fiemap API was not designed for compressed files, but I think we
> > > > > should just go with extending the length of the last compressed extent in the
> > > > > cluster to cover the remaining logical blocks, i.e.:
> > > > >
> > > > > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> > > > >
> > > > > That's what btrfs does on compressed files.
> > > > >
> > > > > - Eric
> > > >
> > > > I also agree that that's an implementation detail that shouldn't be
> > > > exposed to userspace.
> > > >
> > > > I want to make it more clear for better appearance.
> > > >
> > > > Do you think we have to remove "unwritten" information below? I also
> > > > think it might be unnecessary information for the user.
> > > > [0..31]: 2683128..2683159 flag(0x1009) -> merged, encoded, last_extent
> > > > (unwritten?)
> > >
> > > FIEMAP_EXTENT_UNWRITTEN already has a specific meaning; see
> > > Documentation/filesystems/fiemap.rst. It means that the data is all zeroes, and
> > > the disk space is preallocated but the data hasn't been written to disk yet.
> > >
> > > In this case, the data is *not* necessarily all zeroes. So I think
> > > FIEMAP_EXTENT_UNWRITTEN shouldn't be used here.
> > >
> > > > Do you want f2fs to print out the info on a cluster basis, even when
> > > > the user asks for one block information?
> > > > Like
> > > > If the user asks for the info of [8..15], f2fs will return the info of [0..31]?
> > >
> > > Yes, since that's how FS_IOC_FIEMAP is supposed to work; see
> > > Documentation/filesystems/fiemap.rst:
> > >
> > > All offsets and lengths are in bytes and mirror those on disk. It is
> > > valid for an extents logical offset to start before the request or its
> > > logical length to extend past the request.
> > >
> > > (That being said, the f2fs compression+encryption tests I've written don't
> > > exercise this case; they only map the whole file at once.)
> > >
> > > - Eric
> >
> > My last question is.
> > How about a discontinuous cluster like [0..31] maps to discontinuous
> > three blocks like physical address 0x4, 0x14 and 0x24.
> > I think we have to return three extents for the one logical region
> > like the below. What do you think?
> > [0..31] -> 0x4 (merged, encoded)
> > [0..31] -> 0x14 (merged, encoded)
> > [0..31] -> 0x24 (merged, encoded, last_extent)
>
> No, please don't do that. struct fiemap_extent only has a single length field,
> not separate lengths for fe_logical and fe_physical, so with your proposal there
> would be no way to know how many physical blocks to take from each extent. It
> would be reporting the same part of the file in contradictory ways.
>
> Like I suggested originally, I think this case should be reported like:
>
> fe_logical=0 fe_physical=16384 length=4096
> fe_logical=4096 fe_physical=81920 length=4096
> fe_logical=8192 fe_physical=147456 length=8192
>
> It's not perfect, but I think it's the least bad option, for the reasons I've
> explained previously...
>
> - Eric
Ok, I got your point. Let me try it again.
Thank you,
Btw, any chane you could switch f2fs to use iomap_fiemap instead of
the handrolled version? Especially with the work from Eric to add
iomap based direct I/O this should be much simpler now.
Good point. I already made my second version of this.
We could revisit later to make it simple.
Thank you,
On Wed, Jul 21, 2021 at 11:25 PM Christoph Hellwig <[email protected]> wrote:
>
> Btw, any chane you could switch f2fs to use iomap_fiemap instead of
> the handrolled version? Especially with the work from Eric to add
> iomap based direct I/O this should be much simpler now.