From: Zhang Yi <[email protected]>
When truncating a realtime file down to an unaligned size on a
filesystem whose sb_rextsize is bigger than one block,
xfs_truncate_page() only zeroes out the tail of the EOF block. This can
expose stale data since commit 943bc0882ceb ("iomap: don't increase
i_size if it's not a write operation").
If we truncate a file that contains a large enough written extent:

     |<    rtext    >|<    rtext    >|
  ...WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
        ^ (new EOF)                  ^ old EOF

Since we only zero out the tail of the EOF block, and
xfs_itruncate_extents() unmaps only the whole aligned extents, the file
ends up in this state:

     |<    rtext    >|
  ...WWWzWWWWWWWWWWWWW
        ^ new EOF

Then if we do an extending write like this, the blocks in the previous
tail extent become stale:

     |<    rtext    >|
  ...WWWzSSSSSSSSSSSSS..........WWWWWWWWWWWWWWWWW
        ^ old EOF               ^ append start  ^ new EOF

Fix this by zeroing out the tail allocation unit and also making sure
xfs_itruncate_extents() unmaps rtextsize-aligned extents.
Fixes: 943bc0882ceb ("iomap: don't increase i_size if it's not a write operation")
Reported-by: Chandan Babu R <[email protected]>
Link: https://lore.kernel.org/linux-xfs/[email protected]
Signed-off-by: Zhang Yi <[email protected]>
---
fs/xfs/xfs_inode.c | 3 +++
fs/xfs/xfs_iops.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 58fb7a5062e1..db35167acef6 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -35,6 +35,7 @@
#include "xfs_trans_priv.h"
#include "xfs_log.h"
#include "xfs_bmap_btree.h"
+#include "xfs_rtbitmap.h"
#include "xfs_reflink.h"
#include "xfs_ag.h"
#include "xfs_log_priv.h"
@@ -1512,6 +1513,8 @@ xfs_itruncate_extents_flags(
* the page cache can't scale that far.
*/
first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
+ if (xfs_inode_has_bigrtalloc(ip))
+ first_unmap_block = xfs_rtb_roundup_rtx(mp, first_unmap_block);
if (!xfs_verify_fileoff(mp, first_unmap_block)) {
WARN_ON_ONCE(first_unmap_block > XFS_MAX_FILEOFF);
return 0;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index d24927075022..ec7b7bdf8825 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -865,7 +865,7 @@ xfs_setattr_size(
*/
write_back = newsize > ip->i_disk_size && oldsize != ip->i_disk_size;
if (newsize < oldsize) {
- unsigned int blocksize = i_blocksize(inode);
+ unsigned int blocksize = xfs_inode_alloc_unitsize(ip);
/*
* Zeroing out the partial EOF block and the rest of the extra
--
2.39.2
On Wed, May 29, 2024 at 05:52:04PM +0800, Zhang Yi wrote:
> + if (xfs_inode_has_bigrtalloc(ip))
> + first_unmap_block = xfs_rtb_roundup_rtx(mp, first_unmap_block);
Given that first_unmap_block is a xfs_fileoff_t and not a xfs_rtblock_t,
this looks a bit confusing. I'd suggest to just open code the
arithmetic in xfs_rtb_roundup_rtx. For future proofing maybe also
use xfs_inode_alloc_unitsize() as in the hunk below instead of hard
coding the rtextsize. I.e.:
	first_unmap_block = XFS_B_TO_FSB(mp,
			roundup_64(new_size, xfs_inode_alloc_unitsize(ip)));
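
To illustrate the arithmetic, here is a minimal user-space sketch, not
kernel code. roundup_u64() and bytes_to_fsb() are hypothetical stand-ins
for roundup_64() and XFS_B_TO_FSB(), and the block size and allocation
unit size are assumed to be passed in explicitly rather than taken from
the mount or the inode:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for roundup_64(): round x up to a multiple of y. */
static uint64_t roundup_u64(uint64_t x, uint64_t y)
{
	assert(y != 0);
	return ((x + y - 1) / y) * y;
}

/* Hypothetical stand-in for XFS_B_TO_FSB(): bytes to blocks, rounding up. */
static uint64_t bytes_to_fsb(uint64_t bytes, uint64_t blocksize)
{
	assert(blocksize != 0);
	return (bytes + blocksize - 1) / blocksize;
}

/* First block to unmap when new_size is only rounded up to a block. */
static uint64_t first_unmap_by_block(uint64_t new_size, uint64_t blocksize)
{
	return bytes_to_fsb(new_size, blocksize);
}

/*
 * First block to unmap when new_size is first rounded up to the
 * allocation unit (assumed to be a multiple of the block size).
 */
static uint64_t first_unmap_by_unit(uint64_t new_size, uint64_t blocksize,
				    uint64_t unitsize)
{
	assert(unitsize % blocksize == 0);
	return bytes_to_fsb(roundup_u64(new_size, unitsize), blocksize);
}
```

With 4096-byte blocks and a 16384-byte allocation unit, a new_size of
5000 gives block 2 when rounding to a block only, and block 4 when
rounding to the allocation unit first. When the allocation unit equals
the block size, both helpers return the same block.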
On 2024/5/31 21:36, Christoph Hellwig wrote:
> On Wed, May 29, 2024 at 05:52:04PM +0800, Zhang Yi wrote:
>> + if (xfs_inode_has_bigrtalloc(ip))
>> + first_unmap_block = xfs_rtb_roundup_rtx(mp, first_unmap_block);
>
> Given that first_unmap_block is a xfs_fileoff_t and not a xfs_rtblock_t,
> this looks a bit confusing. I'd suggest to just open code the
> arithmetic in xfs_rtb_roundup_rtx. For future proofing maybe also
> use xfs_inode_alloc_unitsize() as in the hunk below instead of hard
> coding the rtextsize. I.e.:
>
> first_unmap_block = XFS_B_TO_FSB(mp,
> roundup_64(new_size, xfs_inode_alloc_unitsize(ip)));
>
Sure, makes sense to me.
Thanks,
Yi.