We had to grab the inode before retrieving i_ino.
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/file.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index a0a4413d6083b..9d4c3e3503567 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
static int get_parent_ino(struct inode *inode, nid_t *pino)
{
struct dentry *dentry;
+ struct inode *parent;
inode = igrab(inode);
dentry = d_find_any_alias(inode);
@@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
if (!dentry)
return 0;
- *pino = parent_ino(dentry);
+ parent = igrab(d_inode(dentry->d_parent));
dput(dentry);
+ if (!parent)
+ return 0;
+
+ *pino = parent->i_ino;
+ iput(parent);
return 1;
}
--
2.26.2.526.g744177e7f7-goog
On Tue, May 05, 2020 at 08:31:39AM -0700, Jaegeuk Kim wrote:
> We had to grab the inode before retrieving i_ino.
>
> Signed-off-by: Jaegeuk Kim <[email protected]>
> ---
> fs/f2fs/file.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index a0a4413d6083b..9d4c3e3503567 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> static int get_parent_ino(struct inode *inode, nid_t *pino)
> {
> struct dentry *dentry;
> + struct inode *parent;
>
> inode = igrab(inode);
> dentry = d_find_any_alias(inode);
> @@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> if (!dentry)
> return 0;
>
> - *pino = parent_ino(dentry);
> + parent = igrab(d_inode(dentry->d_parent));
> dput(dentry);
> + if (!parent)
> + return 0;
> +
> + *pino = parent->i_ino;
> + iput(parent);
> return 1;
This doesn't appear to be necessary. parent_ino() is:
spin_lock(&dentry->d_lock);
res = dentry->d_parent->d_inode->i_ino;
spin_unlock(&dentry->d_lock);
Since dentry is locked and referenced, ->d_parent is stable and positive.
In the encrypt+casefold patch I was reviewing, it's indeed necessary, but only
because there was a check of inode->i_flags added outside the locked region.
The following would be simpler:
spin_lock(&dentry->d_lock);
dir = dentry->d_parent->d_inode;
*pino = dir->i_ino;
needs_recovery = IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir);
spin_unlock(&dentry->d_lock);
BTW, d_find_any_alias() is unnecessary too. This code should just be using
file_dentry(file) from f2fs_do_sync_file().
- Eric
On Tue, May 05, 2020 at 09:58:47AM -0700, Eric Biggers wrote:
> On Tue, May 05, 2020 at 08:31:39AM -0700, Jaegeuk Kim wrote:
> > We had to grab the inode before retrieving i_ino.
> >
> > Signed-off-by: Jaegeuk Kim <[email protected]>
> > ---
> > fs/f2fs/file.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index a0a4413d6083b..9d4c3e3503567 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> > static int get_parent_ino(struct inode *inode, nid_t *pino)
> > {
> > struct dentry *dentry;
> > + struct inode *parent;
> >
> > inode = igrab(inode);
> > dentry = d_find_any_alias(inode);
> > @@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> > if (!dentry)
> > return 0;
> >
> > - *pino = parent_ino(dentry);
> > + parent = igrab(d_inode(dentry->d_parent));
> > dput(dentry);
> > + if (!parent)
> > + return 0;
> > +
> > + *pino = parent->i_ino;
> > + iput(parent);
> > return 1;
>
> This doesn't appear to be necessary. parent_ino() is:
>
> spin_lock(&dentry->d_lock);
> res = dentry->d_parent->d_inode->i_ino;
> spin_unlock(&dentry->d_lock);
>
> Since dentry is locked and referenced, ->d_parent is stable and positive.
>
> In the encrypt+casefold patch I was reviewing, it's indeed necessary, but only
> because there was a check of inode->i_flags added outside the locked region.
> The following would be simpler:
>
> spin_lock(&dentry->d_lock);
> dir = dentry->d_parent->d_inode;
> *pino = dir->i_ino;
> needs_recovery = IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir);
> spin_unlock(&dentry->d_lock);
>
> BTW, d_find_any_alias() is unnecessary too. This code should just be using
> file_dentry(file) from f2fs_do_sync_file().
>
Also, what is this code trying to accomplish? If it's trying to find the parent
directory of an inode with i_nlink == 1, this isn't the correct way to do it.
The fsync could be done via a deleted file, which would make the wrong p_ino be
set. I think the correct approach would be to iterate through all the dentry's
aliases, and choose the parent directory that's !IS_DEADDIR().
- Eric
On 05/05, Eric Biggers wrote:
> On Tue, May 05, 2020 at 08:31:39AM -0700, Jaegeuk Kim wrote:
> > We had to grab the inode before retrieving i_ino.
> >
> > Signed-off-by: Jaegeuk Kim <[email protected]>
> > ---
> > fs/f2fs/file.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index a0a4413d6083b..9d4c3e3503567 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> > static int get_parent_ino(struct inode *inode, nid_t *pino)
> > {
> > struct dentry *dentry;
> > + struct inode *parent;
> >
> > inode = igrab(inode);
> > dentry = d_find_any_alias(inode);
> > @@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> > if (!dentry)
> > return 0;
> >
> > - *pino = parent_ino(dentry);
> > + parent = igrab(d_inode(dentry->d_parent));
> > dput(dentry);
> > + if (!parent)
> > + return 0;
> > +
> > + *pino = parent->i_ino;
> > + iput(parent);
> > return 1;
Hi Eric,
>
> This doesn't appear to be necessary. parent_ino() is:
>
> spin_lock(&dentry->d_lock);
> res = dentry->d_parent->d_inode->i_ino;
> spin_unlock(&dentry->d_lock);
>
> Since dentry is locked and referenced, ->d_parent is stable and positive.
I see, thanks. :)
>
> In the encrypt+casefold patch I was reviewing, it's indeed necessary, but only
> because there was a check of inode->i_flags added outside the locked region.
> The following would be simpler:
>
> spin_lock(&dentry->d_lock);
> dir = dentry->d_parent->d_inode;
> *pino = dir->i_ino;
> needs_recovery = IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir);
> spin_unlock(&dentry->d_lock);
Ack.
>
> BTW, d_find_any_alias() is unnecessary too. This code should just be using
> file_dentry(file) from f2fs_do_sync_file().
How about this?
From 9aee969a413b1ed22b48573071bc93fbb4a2002d Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <[email protected]>
Date: Tue, 5 May 2020 11:08:58 -0700
Subject: [PATCH] f2fs: remove unnecessary dentry locks
As Eric commented, let's kill unnecessary dentry ops when recovering
parent inode number.
Suggested-by: Eric Biggers <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/file.c | 26 ++++++--------------------
1 file changed, 6 insertions(+), 20 deletions(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index a0a4413d6083b..711cebad36fc5 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -165,21 +165,6 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
.page_mkwrite = f2fs_vm_page_mkwrite,
};
-static int get_parent_ino(struct inode *inode, nid_t *pino)
-{
- struct dentry *dentry;
-
- inode = igrab(inode);
- dentry = d_find_any_alias(inode);
- iput(inode);
- if (!dentry)
- return 0;
-
- *pino = parent_ino(dentry);
- dput(dentry);
- return 1;
-}
-
static inline enum cp_reason_type need_do_checkpoint(struct inode *inode)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
@@ -223,14 +208,15 @@ static bool need_inode_page_update(struct f2fs_sb_info *sbi, nid_t ino)
return ret;
}
-static void try_to_fix_pino(struct inode *inode)
+static void try_to_fix_pino(struct dentry *dentry)
{
+ struct inode *inode = d_inode(dentry);
struct f2fs_inode_info *fi = F2FS_I(inode);
- nid_t pino;
down_write(&fi->i_sem);
- if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
- get_parent_ino(inode, &pino)) {
+ if (file_wrong_pino(inode) && inode->i_nlink == 1) {
+ nid_t pino = parent_ino(dentry);
+
f2fs_i_pino_write(inode, pino);
file_got_pino(inode);
}
@@ -310,7 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
* We've secured consistency through sync_fs. Following pino
* will be used only for fsynced inodes after checkpoint.
*/
- try_to_fix_pino(inode);
+ try_to_fix_pino(file_dentry(file));
clear_inode_flag(inode, FI_APPEND_WRITE);
clear_inode_flag(inode, FI_UPDATE_WRITE);
goto out;
--
2.26.2.526.g744177e7f7-goog
On Tue, May 05, 2020 at 11:13:23AM -0700, Jaegeuk Kim wrote:
> On 05/05, Eric Biggers wrote:
> > On Tue, May 05, 2020 at 08:31:39AM -0700, Jaegeuk Kim wrote:
> > > We had to grab the inode before retrieving i_ino.
> > >
> > > Signed-off-by: Jaegeuk Kim <[email protected]>
> > > ---
> > > fs/f2fs/file.c | 8 +++++++-
> > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > > index a0a4413d6083b..9d4c3e3503567 100644
> > > --- a/fs/f2fs/file.c
> > > +++ b/fs/f2fs/file.c
> > > @@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> > > static int get_parent_ino(struct inode *inode, nid_t *pino)
> > > {
> > > struct dentry *dentry;
> > > + struct inode *parent;
> > >
> > > inode = igrab(inode);
> > > dentry = d_find_any_alias(inode);
> > > @@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> > > if (!dentry)
> > > return 0;
> > >
> > > - *pino = parent_ino(dentry);
> > > + parent = igrab(d_inode(dentry->d_parent));
> > > dput(dentry);
> > > + if (!parent)
> > > + return 0;
> > > +
> > > + *pino = parent->i_ino;
> > > + iput(parent);
> > > return 1;
>
> Hi Eric,
>
> >
> > This doesn't appear to be necessary. parent_ino() is:
> >
> > spin_lock(&dentry->d_lock);
> > res = dentry->d_parent->d_inode->i_ino;
> > spin_unlock(&dentry->d_lock);
> >
> > Since dentry is locked and referenced, ->d_parent is stable and positive.
>
> I see, thanks. :)
>
> >
> > In the encrypt+casefold patch I was reviewing, it's indeed necessary, but only
> > because there was a check of inode->i_flags added outside the locked region.
> > The following would be simpler:
> >
> > spin_lock(&dentry->d_lock);
> > dir = dentry->d_parent->d_inode;
> > *pino = dir->i_ino;
> > needs_recovery = IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir);
> > spin_unlock(&dentry->d_lock);
>
> Ack.
>
> >
> > BTW, d_find_any_alias() is unnecessary too. This code should just be using
> > file_dentry(file) from f2fs_do_sync_file().
>
> How about this?
>
> From 9aee969a413b1ed22b48573071bc93fbb4a2002d Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim <[email protected]>
> Date: Tue, 5 May 2020 11:08:58 -0700
> Subject: [PATCH] f2fs: remove unnecessary dentry locks
>
> As Eric commented, let's kill unnecessary dentry ops when recovering
> parent inode number.
>
> Suggested-by: Eric Biggers <[email protected]>
> Signed-off-by: Jaegeuk Kim <[email protected]>
> ---
> fs/f2fs/file.c | 26 ++++++--------------------
> 1 file changed, 6 insertions(+), 20 deletions(-)
>
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index a0a4413d6083b..711cebad36fc5 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -165,21 +165,6 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> .page_mkwrite = f2fs_vm_page_mkwrite,
> };
>
> -static int get_parent_ino(struct inode *inode, nid_t *pino)
> -{
> - struct dentry *dentry;
> -
> - inode = igrab(inode);
> - dentry = d_find_any_alias(inode);
> - iput(inode);
> - if (!dentry)
> - return 0;
> -
> - *pino = parent_ino(dentry);
> - dput(dentry);
> - return 1;
> -}
> -
> static inline enum cp_reason_type need_do_checkpoint(struct inode *inode)
> {
> struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> @@ -223,14 +208,15 @@ static bool need_inode_page_update(struct f2fs_sb_info *sbi, nid_t ino)
> return ret;
> }
>
> -static void try_to_fix_pino(struct inode *inode)
> +static void try_to_fix_pino(struct dentry *dentry)
> {
> + struct inode *inode = d_inode(dentry);
> struct f2fs_inode_info *fi = F2FS_I(inode);
> - nid_t pino;
>
> down_write(&fi->i_sem);
> - if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
> - get_parent_ino(inode, &pino)) {
> + if (file_wrong_pino(inode) && inode->i_nlink == 1) {
> + nid_t pino = parent_ino(dentry);
> +
> f2fs_i_pino_write(inode, pino);
> file_got_pino(inode);
> }
> @@ -310,7 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
> * We've secured consistency through sync_fs. Following pino
> * will be used only for fsynced inodes after checkpoint.
> */
> - try_to_fix_pino(inode);
> + try_to_fix_pino(file_dentry(file));
> clear_inode_flag(inode, FI_APPEND_WRITE);
> clear_inode_flag(inode, FI_UPDATE_WRITE);
> goto out;
Actually, I think this is wrong because the fsync can be done via a file
descriptor that was opened to a now-deleted link to the file.
We need to find the dentry whose parent directory is still exists, i.e. the
parent directory that is counting towards 'inode->i_nlink == 1'.
I think d_find_alias() is what we're looking for.
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6ab8f621a3c5..855f27468baa 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -165,13 +165,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
{
struct dentry *dentry;
- inode = igrab(inode);
- dentry = d_find_any_alias(inode);
- iput(inode);
+ dentry = d_find_alias(inode);
if (!dentry)
return 0;
On 05/05, Eric Biggers wrote:
> On Tue, May 05, 2020 at 09:58:47AM -0700, Eric Biggers wrote:
> > On Tue, May 05, 2020 at 08:31:39AM -0700, Jaegeuk Kim wrote:
> > > We had to grab the inode before retrieving i_ino.
> > >
> > > Signed-off-by: Jaegeuk Kim <[email protected]>
> > > ---
> > > fs/f2fs/file.c | 8 +++++++-
> > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > > index a0a4413d6083b..9d4c3e3503567 100644
> > > --- a/fs/f2fs/file.c
> > > +++ b/fs/f2fs/file.c
> > > @@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> > > static int get_parent_ino(struct inode *inode, nid_t *pino)
> > > {
> > > struct dentry *dentry;
> > > + struct inode *parent;
> > >
> > > inode = igrab(inode);
> > > dentry = d_find_any_alias(inode);
> > > @@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> > > if (!dentry)
> > > return 0;
> > >
> > > - *pino = parent_ino(dentry);
> > > + parent = igrab(d_inode(dentry->d_parent));
> > > dput(dentry);
> > > + if (!parent)
> > > + return 0;
> > > +
> > > + *pino = parent->i_ino;
> > > + iput(parent);
> > > return 1;
> >
> > This doesn't appear to be necessary. parent_ino() is:
> >
> > spin_lock(&dentry->d_lock);
> > res = dentry->d_parent->d_inode->i_ino;
> > spin_unlock(&dentry->d_lock);
> >
> > Since dentry is locked and referenced, ->d_parent is stable and positive.
> >
> > In the encrypt+casefold patch I was reviewing, it's indeed necessary, but only
> > because there was a check of inode->i_flags added outside the locked region.
> > The following would be simpler:
> >
> > spin_lock(&dentry->d_lock);
> > dir = dentry->d_parent->d_inode;
> > *pino = dir->i_ino;
> > needs_recovery = IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir);
> > spin_unlock(&dentry->d_lock);
> >
> > BTW, d_find_any_alias() is unnecessary too. This code should just be using
> > file_dentry(file) from f2fs_do_sync_file().
> >
>
> Also, what is this code trying to accomplish? If it's trying to find the parent
> directory of an inode with i_nlink == 1, this isn't the correct way to do it.
> The fsync could be done via a deleted file, which would make the wrong p_ino be
> set. I think the correct approach would be to iterate through all the dentry's
> aliases, and choose the parent directory that's !IS_DEADDIR().
The intention is to give a chance to recover the pino to avoid performance
drop on fsync() by avoiding checkpoint(). And the purpose of this is to
find a file having single linked file. Otherwise, we'll do checkpoint().
>
> - Eric
On 05/05, Eric Biggers wrote:
> On Tue, May 05, 2020 at 11:13:23AM -0700, Jaegeuk Kim wrote:
> > On 05/05, Eric Biggers wrote:
> > > On Tue, May 05, 2020 at 08:31:39AM -0700, Jaegeuk Kim wrote:
> > > > We had to grab the inode before retrieving i_ino.
> > > >
> > > > Signed-off-by: Jaegeuk Kim <[email protected]>
> > > > ---
> > > > fs/f2fs/file.c | 8 +++++++-
> > > > 1 file changed, 7 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > > > index a0a4413d6083b..9d4c3e3503567 100644
> > > > --- a/fs/f2fs/file.c
> > > > +++ b/fs/f2fs/file.c
> > > > @@ -168,6 +168,7 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> > > > static int get_parent_ino(struct inode *inode, nid_t *pino)
> > > > {
> > > > struct dentry *dentry;
> > > > + struct inode *parent;
> > > >
> > > > inode = igrab(inode);
> > > > dentry = d_find_any_alias(inode);
> > > > @@ -175,8 +176,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> > > > if (!dentry)
> > > > return 0;
> > > >
> > > > - *pino = parent_ino(dentry);
> > > > + parent = igrab(d_inode(dentry->d_parent));
> > > > dput(dentry);
> > > > + if (!parent)
> > > > + return 0;
> > > > +
> > > > + *pino = parent->i_ino;
> > > > + iput(parent);
> > > > return 1;
> >
> > Hi Eric,
> >
> > >
> > > This doesn't appear to be necessary. parent_ino() is:
> > >
> > > spin_lock(&dentry->d_lock);
> > > res = dentry->d_parent->d_inode->i_ino;
> > > spin_unlock(&dentry->d_lock);
> > >
> > > Since dentry is locked and referenced, ->d_parent is stable and positive.
> >
> > I see, thanks. :)
> >
> > >
> > > In the encrypt+casefold patch I was reviewing, it's indeed necessary, but only
> > > because there was a check of inode->i_flags added outside the locked region.
> > > The following would be simpler:
> > >
> > > spin_lock(&dentry->d_lock);
> > > dir = dentry->d_parent->d_inode;
> > > *pino = dir->i_ino;
> > > needs_recovery = IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir);
> > > spin_unlock(&dentry->d_lock);
> >
> > Ack.
> >
> > >
> > > BTW, d_find_any_alias() is unnecessary too. This code should just be using
> > > file_dentry(file) from f2fs_do_sync_file().
> >
> > How about this?
> >
> > From 9aee969a413b1ed22b48573071bc93fbb4a2002d Mon Sep 17 00:00:00 2001
> > From: Jaegeuk Kim <[email protected]>
> > Date: Tue, 5 May 2020 11:08:58 -0700
> > Subject: [PATCH] f2fs: remove unnecessary dentry locks
> >
> > As Eric commented, let's kill unnecessary dentry ops when recovering
> > parent inode number.
> >
> > Suggested-by: Eric Biggers <[email protected]>
> > Signed-off-by: Jaegeuk Kim <[email protected]>
> > ---
> > fs/f2fs/file.c | 26 ++++++--------------------
> > 1 file changed, 6 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index a0a4413d6083b..711cebad36fc5 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -165,21 +165,6 @@ static const struct vm_operations_struct f2fs_file_vm_ops = {
> > .page_mkwrite = f2fs_vm_page_mkwrite,
> > };
> >
> > -static int get_parent_ino(struct inode *inode, nid_t *pino)
> > -{
> > - struct dentry *dentry;
> > -
> > - inode = igrab(inode);
> > - dentry = d_find_any_alias(inode);
> > - iput(inode);
> > - if (!dentry)
> > - return 0;
> > -
> > - *pino = parent_ino(dentry);
> > - dput(dentry);
> > - return 1;
> > -}
> > -
> > static inline enum cp_reason_type need_do_checkpoint(struct inode *inode)
> > {
> > struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> > @@ -223,14 +208,15 @@ static bool need_inode_page_update(struct f2fs_sb_info *sbi, nid_t ino)
> > return ret;
> > }
> >
> > -static void try_to_fix_pino(struct inode *inode)
> > +static void try_to_fix_pino(struct dentry *dentry)
> > {
> > + struct inode *inode = d_inode(dentry);
> > struct f2fs_inode_info *fi = F2FS_I(inode);
> > - nid_t pino;
> >
> > down_write(&fi->i_sem);
> > - if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
> > - get_parent_ino(inode, &pino)) {
> > + if (file_wrong_pino(inode) && inode->i_nlink == 1) {
> > + nid_t pino = parent_ino(dentry);
> > +
> > f2fs_i_pino_write(inode, pino);
> > file_got_pino(inode);
> > }
> > @@ -310,7 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
> > * We've secured consistency through sync_fs. Following pino
> > * will be used only for fsynced inodes after checkpoint.
> > */
> > - try_to_fix_pino(inode);
> > + try_to_fix_pino(file_dentry(file));
> > clear_inode_flag(inode, FI_APPEND_WRITE);
> > clear_inode_flag(inode, FI_UPDATE_WRITE);
> > goto out;
>
> Actually, I think this is wrong because the fsync can be done via a file
> descriptor that was opened to a now-deleted link to the file.
>
> We need to find the dentry whose parent directory is still exists, i.e. the
> parent directory that is counting towards 'inode->i_nlink == 1'.
>
> I think d_find_alias() is what we're looking for.
Yeah, it seems to happen when open/rename/fsync calls, or race condition of
file deletion.
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 6ab8f621a3c5..855f27468baa 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -165,13 +165,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> {
> struct dentry *dentry;
>
> - inode = igrab(inode);
> - dentry = d_find_any_alias(inode);
> - iput(inode);
> + dentry = d_find_alias(inode);
> if (!dentry)
> return 0;
How about this?
From 2a6b0e53e592854306062a2dc35db2d8f79062f2 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim <[email protected]>
Date: Tue, 5 May 2020 11:33:29 -0700
Subject: [PATCH] f2fs: find a living dentry when finding parent ino
We need to check any dentry still alive to get parent inode number.
Suggested-by: Eric Biggers <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/file.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index a0a4413d6083b..95139cb85faca 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -169,9 +169,8 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
{
struct dentry *dentry;
- inode = igrab(inode);
- dentry = d_find_any_alias(inode);
- iput(inode);
+ /* Need to check if valid dentry still exists. */
+ dentry = d_find_alias(inode);
if (!dentry)
return 0;
--
2.26.2.526.g744177e7f7-goog
On Tue, May 05, 2020 at 11:49:32AM -0700, Jaegeuk Kim wrote:
> How about this?
>
> From 2a6b0e53e592854306062a2dc35db2d8f79062f2 Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim <[email protected]>
> Date: Tue, 5 May 2020 11:33:29 -0700
> Subject: [PATCH] f2fs: find a living dentry when finding parent ino
>
> We need to check any dentry still alive to get parent inode number.
>
> Suggested-by: Eric Biggers <[email protected]>
> Signed-off-by: Jaegeuk Kim <[email protected]>
> ---
> fs/f2fs/file.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index a0a4413d6083b..95139cb85faca 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -169,9 +169,8 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> {
> struct dentry *dentry;
>
> - inode = igrab(inode);
> - dentry = d_find_any_alias(inode);
> - iput(inode);
> + /* Need to check if valid dentry still exists. */
> + dentry = d_find_alias(inode);
> if (!dentry)
> return 0;
>
It's fine, but it could use some more explanation. (What's a "valid dentry"?)
How about the following?
From f8fe7d57eead1423e8548ac7a5ec881d701466a5 Mon Sep 17 00:00:00 2001
From: Eric Biggers <[email protected]>
Date: Tue, 5 May 2020 11:41:11 -0700
Subject: [PATCH] f2fs: correctly fix the parent inode number during fsync()
fsync() may be called on a deleted file that's still open. So when
fsync() tries to set the parent inode number when the inode has
LOST_PINO and i_nlink == 1 (to avoid later checkpoints), it needs to
make sure to get the parent directory via a non-deleted alias.
Also remove the unnecessary igrab() and iput(), as the caller already
holds a reference to the inode.
Signed-off-by: Eric Biggers <[email protected]>
---
fs/f2fs/file.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 6ab8f621a3c5a2..b3069188fd3478 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -165,9 +165,11 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
{
struct dentry *dentry;
- inode = igrab(inode);
- dentry = d_find_any_alias(inode);
- iput(inode);
+ /*
+ * Make sure to get the non-deleted alias. The alias associated with
+ * the open file descriptor being fsync()'ed may be deleted already.
+ */
+ dentry = d_find_alias(inode);
if (!dentry)
return 0;
--
2.26.2
On 05/05, Eric Biggers wrote:
> On Tue, May 05, 2020 at 11:49:32AM -0700, Jaegeuk Kim wrote:
> > How about this?
> >
> > From 2a6b0e53e592854306062a2dc35db2d8f79062f2 Mon Sep 17 00:00:00 2001
> > From: Jaegeuk Kim <[email protected]>
> > Date: Tue, 5 May 2020 11:33:29 -0700
> > Subject: [PATCH] f2fs: find a living dentry when finding parent ino
> >
> > We need to check any dentry still alive to get parent inode number.
> >
> > Suggested-by: Eric Biggers <[email protected]>
> > Signed-off-by: Jaegeuk Kim <[email protected]>
> > ---
> > fs/f2fs/file.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> > index a0a4413d6083b..95139cb85faca 100644
> > --- a/fs/f2fs/file.c
> > +++ b/fs/f2fs/file.c
> > @@ -169,9 +169,8 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> > {
> > struct dentry *dentry;
> >
> > - inode = igrab(inode);
> > - dentry = d_find_any_alias(inode);
> > - iput(inode);
> > + /* Need to check if valid dentry still exists. */
> > + dentry = d_find_alias(inode);
> > if (!dentry)
> > return 0;
> >
>
> It's fine, but it could use some more explanation. (What's a "valid dentry"?)
> How about the following?
Cool, I took this. Thanks,
>
> >From f8fe7d57eead1423e8548ac7a5ec881d701466a5 Mon Sep 17 00:00:00 2001
> From: Eric Biggers <[email protected]>
> Date: Tue, 5 May 2020 11:41:11 -0700
> Subject: [PATCH] f2fs: correctly fix the parent inode number during fsync()
>
> fsync() may be called on a deleted file that's still open. So when
> fsync() tries to set the parent inode number when the inode has
> LOST_PINO and i_nlink == 1 (to avoid later checkpoints), it needs to
> make sure to get the parent directory via a non-deleted alias.
>
> Also remove the unnecessary igrab() and iput(), as the caller already
> holds a reference to the inode.
>
> Signed-off-by: Eric Biggers <[email protected]>
> ---
> fs/f2fs/file.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 6ab8f621a3c5a2..b3069188fd3478 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -165,9 +165,11 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> {
> struct dentry *dentry;
>
> - inode = igrab(inode);
> - dentry = d_find_any_alias(inode);
> - iput(inode);
> + /*
> + * Make sure to get the non-deleted alias. The alias associated with
> + * the open file descriptor being fsync()'ed may be deleted already.
> + */
> + dentry = d_find_alias(inode);
> if (!dentry)
> return 0;
>
> --
> 2.26.2
Hi Eric,
On Tue, May 05, 2020 at 11:19:41AM -0700, Eric Biggers wrote:
> On Tue, May 05, 2020 at 11:13:23AM -0700, Jaegeuk Kim wrote:
...
> >
> > -static int get_parent_ino(struct inode *inode, nid_t *pino)
> > -{
> > - struct dentry *dentry;
> > -
> > - inode = igrab(inode);
> > - dentry = d_find_any_alias(inode);
> > - iput(inode);
> > - if (!dentry)
> > - return 0;
> > -
> > - *pino = parent_ino(dentry);
> > - dput(dentry);
> > - return 1;
> > -}
> > -
> > static inline enum cp_reason_type need_do_checkpoint(struct inode *inode)
> > {
> > struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> > @@ -223,14 +208,15 @@ static bool need_inode_page_update(struct f2fs_sb_info *sbi, nid_t ino)
> > return ret;
> > }
> >
> > -static void try_to_fix_pino(struct inode *inode)
> > +static void try_to_fix_pino(struct dentry *dentry)
> > {
> > + struct inode *inode = d_inode(dentry);
> > struct f2fs_inode_info *fi = F2FS_I(inode);
> > - nid_t pino;
> >
> > down_write(&fi->i_sem);
> > - if (file_wrong_pino(inode) && inode->i_nlink == 1 &&
> > - get_parent_ino(inode, &pino)) {
> > + if (file_wrong_pino(inode) && inode->i_nlink == 1) {
> > + nid_t pino = parent_ino(dentry);
> > +
> > f2fs_i_pino_write(inode, pino);
> > file_got_pino(inode);
> > }
> > @@ -310,7 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
> > * We've secured consistency through sync_fs. Following pino
> > * will be used only for fsynced inodes after checkpoint.
> > */
> > - try_to_fix_pino(inode);
> > + try_to_fix_pino(file_dentry(file));
> > clear_inode_flag(inode, FI_APPEND_WRITE);
> > clear_inode_flag(inode, FI_UPDATE_WRITE);
> > goto out;
>
> Actually, I think this is wrong because the fsync can be done via a file
> descriptor that was opened to a now-deleted link to the file.
I'm still confused about this...
I don't know what's wrong with this version from my limited knowledge?
inode itself is locked when fsyncing, so
if the fsync inode->i_nlink == 1, this inode has only one hard link
(not deleted yet) and should belong to a single directory; and
the only one parent directory would not go away (not deleted as well)
since there are some dirents in it (not empty).
Could kindly explain more so I would learn more about this scenario?
Thanks a lot!
>
> We need to find the dentry whose parent directory is still exists, i.e. the
> parent directory that is counting towards 'inode->i_nlink == 1'.
directory counting towards 'inode->i_nlink == 1', what's happening?
>
> I think d_find_alias() is what we're looking for.
It may be simply dentry->d_parent (stable/positive as you said before, and it's
not empty). why need to d_find_alias()?
And what is the original problem? I could not get some clue from the original
patch description (I only saw some extra igrab/iput because of some unknown
reasons), it there some backtrace related to the problem?
Thanks,
Gao Xiang
>
> diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
> index 6ab8f621a3c5..855f27468baa 100644
> --- a/fs/f2fs/file.c
> +++ b/fs/f2fs/file.c
> @@ -165,13 +165,13 @@ static int get_parent_ino(struct inode *inode, nid_t *pino)
> {
> struct dentry *dentry;
>
> - inode = igrab(inode);
> - dentry = d_find_any_alias(inode);
> - iput(inode);
> + dentry = d_find_alias(inode);
> if (!dentry)
> return 0;
>
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
> >
> > Actually, I think this is wrong because the fsync can be done via a file
> > descriptor that was opened to a now-deleted link to the file.
>
> I'm still confused about this...
>
> I don't know what's wrong with this version from my limited knowledge?
> inode itself is locked when fsyncing, so
>
> if the fsync inode->i_nlink == 1, this inode has only one hard link
> (not deleted yet) and should belong to a single directory; and
>
> the only one parent directory would not go away (not deleted as well)
> since there are some dirents in it (not empty).
>
> Could kindly explain more so I would learn more about this scenario?
> Thanks a lot!
i_nlink == 1 just means that there is one non-deleted link. There can be links
that have since been deleted, and file descriptors can still be open to them.
>
> >
> > We need to find the dentry whose parent directory is still exists, i.e. the
> > parent directory that is counting towards 'inode->i_nlink == 1'.
>
> directory counting towards 'inode->i_nlink == 1', what's happening?
The non-deleted link is the one counted in i_nlink.
>
> >
> > I think d_find_alias() is what we're looking for.
>
> It may be simply dentry->d_parent (stable/positive as you said before, and it's
> not empty). why need to d_find_alias()?
Because we need to get the dentry that hasn't been deleted yet, which isn't
necessarily the one associated with the file descriptor being fsync()'ed.
> And what is the original problem? I could not get some clue from the original
> patch description (I only saw some extra igrab/iput because of some unknown
> reasons), it there some backtrace related to the problem?
The problem is that i_pino gets set incorrectly. I just noticed this while
reviewing the code. It's not hard to reproduce, e.g.:
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
int main()
{
int fd;
mkdir("dir1", 0700);
mkdir("dir2", 0700);
mknod("dir1/file", S_IFREG|0600, 0);
link("dir1/file", "dir2/file");
fd = open("dir2/file", O_WRONLY);
unlink("dir2/file");
write(fd, "X", 1);
fsync(fd);
}
Then:
sync
echo N | dump.f2fs -i $(stat -c %i dir1/file) /dev/vdb | grep 'i_pino'
echo "dir1 (correct): $(stat -c %i dir1)"
echo "dir2 (wrong): $(stat -c %i dir2)"
i_pino will point to dir2 rather than dir1 as expected.
- Eric
On Tue, May 05, 2020 at 06:24:28PM -0700, Eric Biggers wrote:
> On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
> > >
> > > Actually, I think this is wrong because the fsync can be done via a file
> > > descriptor that was opened to a now-deleted link to the file.
> >
> > I'm still confused about this...
> >
> > I don't know what's wrong with this version from my limited knowledge?
> > inode itself is locked when fsyncing, so
> >
> > if the fsync inode->i_nlink == 1, this inode has only one hard link
> > (not deleted yet) and should belong to a single directory; and
> >
> > the only one parent directory would not go away (not deleted as well)
> > since there are some dirents in it (not empty).
> >
> > Could kindly explain more so I would learn more about this scenario?
> > Thanks a lot!
>
> i_nlink == 1 just means that there is one non-deleted link. There can be links
> that have since been deleted, and file descriptors can still be open to them.
Thanks for your inspiration. You are right, thanks.
Correct my words... I didn't check f2fs code just now, it seems f2fs doesn't
take inode_lock as some other fs like __generic_file_fsync or ubifs_fsync.
And i_sem locks nlink / try_to_fix_pino similarly in some extent. It seems
no race by using d_find_alias here. Thanks again.
Thanks,
Gao Xiang
>
> >
> > >
> > > We need to find the dentry whose parent directory is still exists, i.e. the
> > > parent directory that is counting towards 'inode->i_nlink == 1'.
> >
> > directory counting towards 'inode->i_nlink == 1', what's happening?
>
> The non-deleted link is the one counted in i_nlink.
>
> >
> > >
> > > I think d_find_alias() is what we're looking for.
> >
> > It may be simply dentry->d_parent (stable/positive as you said before, and it's
> > not empty). why need to d_find_alias()?
>
> Because we need to get the dentry that hasn't been deleted yet, which isn't
> necessarily the one associated with the file descriptor being fsync()'ed.
>
> > And what is the original problem? I could not get some clue from the original
> > patch description (I only saw some extra igrab/iput because of some unknown
> > reasons), it there some backtrace related to the problem?
>
> The problem is that i_pino gets set incorrectly. I just noticed this while
> reviewing the code. It's not hard to reproduce, e.g.:
>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/stat.h>
>
> int main()
> {
> int fd;
>
> mkdir("dir1", 0700);
> mkdir("dir2", 0700);
> mknod("dir1/file", S_IFREG|0600, 0);
> link("dir1/file", "dir2/file");
> fd = open("dir2/file", O_WRONLY);
> unlink("dir2/file");
> write(fd, "X", 1);
> fsync(fd);
> }
>
> Then:
>
> sync
> echo N | dump.f2fs -i $(stat -c %i dir1/file) /dev/vdb | grep 'i_pino'
> echo "dir1 (correct): $(stat -c %i dir1)"
> echo "dir2 (wrong): $(stat -c %i dir2)"
>
> i_pino will point to dir2 rather than dir1 as expected.
>
> - Eric
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
On Wed, May 06, 2020 at 09:58:22AM +0800, Gao Xiang wrote:
> On Tue, May 05, 2020 at 06:24:28PM -0700, Eric Biggers wrote:
> > On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
> > > >
> > > > Actually, I think this is wrong because the fsync can be done via a file
> > > > descriptor that was opened to a now-deleted link to the file.
> > >
> > > I'm still confused about this...
> > >
> > > I don't know what's wrong with this version from my limited knowledge?
> > > inode itself is locked when fsyncing, so
> > >
> > > if the fsync inode->i_nlink == 1, this inode has only one hard link
> > > (not deleted yet) and should belong to a single directory; and
> > >
> > > the only one parent directory would not go away (not deleted as well)
> > > since there are some dirents in it (not empty).
> > >
> > > Could kindly explain more so I would learn more about this scenario?
> > > Thanks a lot!
> >
> > i_nlink == 1 just means that there is one non-deleted link. There can be links
> > that have since been deleted, and file descriptors can still be open to them.
>
> Thanks for your inspiration. You are right, thanks.
>
> Correct my words... I didn't check f2fs code just now, it seems f2fs doesn't
> take inode_lock as some other fs like __generic_file_fsync or ubifs_fsync.
>
> And i_sem locks nlink / try_to_fix_pino similarly in some extent. It seems
> no race by using d_find_alias here. Thanks again.
>
(think more little bit just now...)
Thread 1: Thread 2 (fsync):
vfs_unlink try_to_fix_pino
f2fs_unlink
f2fs_delete_entry
f2fs_drop_nlink (i_sem, inode->i_nlink = 1)
(... but this dentry still hashed) i_sem, check inode->i_nlink = 1
i_sem d_find_alias
d_delete
I'm not sure if fsync could still use some wrong alias by chance..
completely untested, maybe just noise...
Thanks,
Gao Xiang
On 2020/5/6 9:24, Eric Biggers wrote:
> On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
>>>
>>> Actually, I think this is wrong because the fsync can be done via a file
>>> descriptor that was opened to a now-deleted link to the file.
>>
>> I'm still confused about this...
>>
>> I don't know what's wrong with this version from my limited knowledge?
>> inode itself is locked when fsyncing, so
>>
>> if the fsync inode->i_nlink == 1, this inode has only one hard link
>> (not deleted yet) and should belong to a single directory; and
>>
>> the only one parent directory would not go away (not deleted as well)
>> since there are some dirents in it (not empty).
>>
>> Could kindly explain more so I would learn more about this scenario?
>> Thanks a lot!
>
> i_nlink == 1 just means that there is one non-deleted link. There can be links
> that have since been deleted, and file descriptors can still be open to them.
>
>>
>>>
>>> We need to find the dentry whose parent directory is still exists, i.e. the
>>> parent directory that is counting towards 'inode->i_nlink == 1'.
>>
>> directory counting towards 'inode->i_nlink == 1', what's happening?
>
> The non-deleted link is the one counted in i_nlink.
>
>>
>>>
>>> I think d_find_alias() is what we're looking for.
>>
>> It may be simply dentry->d_parent (stable/positive as you said before, and it's
>> not empty). why need to d_find_alias()?
>
> Because we need to get the dentry that hasn't been deleted yet, which isn't
> necessarily the one associated with the file descriptor being fsync()'ed.
>
>> And what is the original problem? I could not get some clue from the original
>> patch description (I only saw some extra igrab/iput because of some unknown
>> reasons), it there some backtrace related to the problem?
>
> The problem is that i_pino gets set incorrectly. I just noticed this while
> reviewing the code. It's not hard to reproduce, e.g.:
>
> #include <unistd.h>
> #include <fcntl.h>
> #include <sys/stat.h>
>
> int main()
> {
> int fd;
>
> mkdir("dir1", 0700);
> mkdir("dir2", 0700);
> mknod("dir1/file", S_IFREG|0600, 0);
> link("dir1/file", "dir2/file");
> fd = open("dir2/file", O_WRONLY);
> unlink("dir2/file");
> write(fd, "X", 1);
> fsync(fd);
> }
>
> Then:
>
> sync
> echo N | dump.f2fs -i $(stat -c %i dir1/file) /dev/vdb | grep 'i_pino'
> echo "dir1 (correct): $(stat -c %i dir1)"
> echo "dir2 (wrong): $(stat -c %i dir2)"
>
> i_pino will point to dir2 rather than dir1 as expected.
Could you add above testcase into commit message of your patch? it will
be easier to understand the issue we solved with it.
In addition, how about adding this testcase in fstest as a generic one?
>
> - Eric
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
>
On Wed, May 06, 2020 at 02:47:19PM +0800, Gao Xiang wrote:
> On Wed, May 06, 2020 at 09:58:22AM +0800, Gao Xiang wrote:
> > On Tue, May 05, 2020 at 06:24:28PM -0700, Eric Biggers wrote:
> > > On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
> > > > >
> > > > > Actually, I think this is wrong because the fsync can be done via a file
> > > > > descriptor that was opened to a now-deleted link to the file.
> > > >
> > > > I'm still confused about this...
> > > >
> > > > I don't know what's wrong with this version from my limited knowledge?
> > > > inode itself is locked when fsyncing, so
> > > >
> > > > if the fsync inode->i_nlink == 1, this inode has only one hard link
> > > > (not deleted yet) and should belong to a single directory; and
> > > >
> > > > the only one parent directory would not go away (not deleted as well)
> > > > since there are some dirents in it (not empty).
> > > >
> > > > Could kindly explain more so I would learn more about this scenario?
> > > > Thanks a lot!
> > >
> > > i_nlink == 1 just means that there is one non-deleted link. There can be links
> > > that have since been deleted, and file descriptors can still be open to them.
> >
> > Thanks for your inspiration. You are right, thanks.
> >
> > Correct my words... I didn't check f2fs code just now, it seems f2fs doesn't
> > take inode_lock as some other fs like __generic_file_fsync or ubifs_fsync.
> >
> > And i_sem locks nlink / try_to_fix_pino similarly in some extent. It seems
> > no race by using d_find_alias here. Thanks again.
> >
>
> (think more little bit just now...)
>
> Thread 1: Thread 2 (fsync):
> vfs_unlink try_to_fix_pino
> f2fs_unlink
> f2fs_delete_entry
> f2fs_drop_nlink (i_sem, inode->i_nlink = 1)
>
> (... but this dentry still hashed) i_sem, check inode->i_nlink = 1
> i_sem d_find_alias
>
> d_delete
>
> I'm not sure if fsync could still use some wrong alias by chance..
> completely untested, maybe just noise...
>
Right, good observation. My patch makes it better, but it's still broken.
I don't know how to fix it. If we see i_nlink == 1 and multiple hashed
dentries, there doesn't appear to be a way to distingush which one corresponds
to the remaining link on-disk (if any; it may not even be in the dcache), and
which correspond to links that vfs_unlink() has deleted from disk but hasn't yet
done d_delete() on.
One idea would be choose one, then take inode_lock_shared(dir) and do
__f2fs_find_entry() to check if the dentry is really still on-disk. That's
heavyweight and error-prone though, and the locking could cause problems.
I'm wondering though, does f2fs really need try_to_fix_pino() at all, and did it
ever really work? It never actually updates the f2fs_inode::i_name to match the
new directory. So independently of this bug with deleted links, I don't see how
it can possibly work as intended.
- Eric
On Wed, May 06, 2020 at 12:16:13PM -0700, Eric Biggers wrote:
> On Wed, May 06, 2020 at 02:47:19PM +0800, Gao Xiang wrote:
> > On Wed, May 06, 2020 at 09:58:22AM +0800, Gao Xiang wrote:
> > > On Tue, May 05, 2020 at 06:24:28PM -0700, Eric Biggers wrote:
> > > > On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
> > > > > >
> > > > > > Actually, I think this is wrong because the fsync can be done via a file
> > > > > > descriptor that was opened to a now-deleted link to the file.
> > > > >
> > > > > I'm still confused about this...
> > > > >
> > > > > I don't know what's wrong with this version from my limited knowledge?
> > > > > inode itself is locked when fsyncing, so
> > > > >
> > > > > if the fsync inode->i_nlink == 1, this inode has only one hard link
> > > > > (not deleted yet) and should belong to a single directory; and
> > > > >
> > > > > the only one parent directory would not go away (not deleted as well)
> > > > > since there are some dirents in it (not empty).
> > > > >
> > > > > Could kindly explain more so I would learn more about this scenario?
> > > > > Thanks a lot!
> > > >
> > > > i_nlink == 1 just means that there is one non-deleted link. There can be links
> > > > that have since been deleted, and file descriptors can still be open to them.
> > >
> > > Thanks for your inspiration. You are right, thanks.
> > >
> > > Correct my words... I didn't check f2fs code just now, it seems f2fs doesn't
> > > take inode_lock as some other fs like __generic_file_fsync or ubifs_fsync.
> > >
> > > And i_sem locks nlink / try_to_fix_pino similarly in some extent. It seems
> > > no race by using d_find_alias here. Thanks again.
> > >
> >
> > (think more little bit just now...)
> >
> > Thread 1: Thread 2 (fsync):
> > vfs_unlink try_to_fix_pino
> > f2fs_unlink
> > f2fs_delete_entry
> > f2fs_drop_nlink (i_sem, inode->i_nlink = 1)
> >
> > (... but this dentry still hashed) i_sem, check inode->i_nlink = 1
> > i_sem d_find_alias
> >
> > d_delete
> >
> > I'm not sure if fsync could still use some wrong alias by chance..
> > completely untested, maybe just noise...
> >
>
> Right, good observation. My patch makes it better, but it's still broken.
>
> I don't know how to fix it. If we see i_nlink == 1 and multiple hashed
> dentries, there doesn't appear to be a way to distingush which one corresponds
> to the remaining link on-disk (if any; it may not even be in the dcache), and
> which correspond to links that vfs_unlink() has deleted from disk but hasn't yet
> done d_delete() on.
>
> One idea would be choose one, then take inode_lock_shared(dir) and do
> __f2fs_find_entry() to check if the dentry is really still on-disk. That's
> heavyweight and error-prone though, and the locking could cause problems.
>
> I'm wondering though, does f2fs really need try_to_fix_pino() at all, and did it
> ever really work? It never actually updates the f2fs_inode::i_name to match the
> new directory. So independently of this bug with deleted links, I don't see how
> it can possibly work as intended.
Part of my humble opinion would be "update pino in rename/unlink/link... such ops
instead of in fsync" (maybe it makes better sense of locking)... But actually I'm
not a f2fs folk now, just curious about what the original patch resolved with
these new extra igrab/iput (as I said before, I could not find some clue previously).
Thanks,
Gao Xiang
>
> - Eric
On 2020/5/6 14:55, Chao Yu wrote:
> On 2020/5/6 9:24, Eric Biggers wrote:
>> On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
>>>>
>>>> Actually, I think this is wrong because the fsync can be done via a file
>>>> descriptor that was opened to a now-deleted link to the file.
>>>
>>> I'm still confused about this...
>>>
>>> I don't know what's wrong with this version from my limited knowledge?
>>> inode itself is locked when fsyncing, so
>>>
>>> if the fsync inode->i_nlink == 1, this inode has only one hard link
>>> (not deleted yet) and should belong to a single directory; and
>>>
>>> the only one parent directory would not go away (not deleted as well)
>>> since there are some dirents in it (not empty).
>>>
>>> Could kindly explain more so I would learn more about this scenario?
>>> Thanks a lot!
>>
>> i_nlink == 1 just means that there is one non-deleted link. There can be links
>> that have since been deleted, and file descriptors can still be open to them.
>>
>>>
>>>>
>>>> We need to find the dentry whose parent directory is still exists, i.e. the
>>>> parent directory that is counting towards 'inode->i_nlink == 1'.
>>>
>>> directory counting towards 'inode->i_nlink == 1', what's happening?
>>
>> The non-deleted link is the one counted in i_nlink.
>>
>>>
>>>>
>>>> I think d_find_alias() is what we're looking for.
>>>
>>> It may be simply dentry->d_parent (stable/positive as you said before, and it's
>>> not empty). why need to d_find_alias()?
>>
>> Because we need to get the dentry that hasn't been deleted yet, which isn't
>> necessarily the one associated with the file descriptor being fsync()'ed.
>>
>>> And what is the original problem? I could not get some clue from the original
>>> patch description (I only saw some extra igrab/iput because of some unknown
>>> reasons), it there some backtrace related to the problem?
>>
>> The problem is that i_pino gets set incorrectly. I just noticed this while
>> reviewing the code. It's not hard to reproduce, e.g.:
>>
>> #include <unistd.h>
>> #include <fcntl.h>
>> #include <sys/stat.h>
>>
>> int main()
>> {
>> int fd;
>>
>> mkdir("dir1", 0700);
>> mkdir("dir2", 0700);
>> mknod("dir1/file", S_IFREG|0600, 0);
>> link("dir1/file", "dir2/file");
>> fd = open("dir2/file", O_WRONLY);
>> unlink("dir2/file");
>> write(fd, "X", 1);
>> fsync(fd);
>> }
>>
>> Then:
>>
>> sync
>> echo N | dump.f2fs -i $(stat -c %i dir1/file) /dev/vdb | grep 'i_pino'
>> echo "dir1 (correct): $(stat -c %i dir1)"
>> echo "dir2 (wrong): $(stat -c %i dir2)"
>>
>> i_pino will point to dir2 rather than dir1 as expected.
>
> Could you add above testcase into commit message of your patch? it will
> be easier to understand the issue we solved with it.
>
> In addition, how about adding this testcase in fstest as a generic one?
Oops, it's not a generic one... please ignore this.
>
>>
>> - Eric
>>
>>
>> _______________________________________________
>> Linux-f2fs-devel mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
>> .
>>
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
>
On 2020/5/7 6:36, Gao Xiang wrote:
> On Wed, May 06, 2020 at 12:16:13PM -0700, Eric Biggers wrote:
>> On Wed, May 06, 2020 at 02:47:19PM +0800, Gao Xiang wrote:
>>> On Wed, May 06, 2020 at 09:58:22AM +0800, Gao Xiang wrote:
>>>> On Tue, May 05, 2020 at 06:24:28PM -0700, Eric Biggers wrote:
>>>>> On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
>>>>>>>
>>>>>>> Actually, I think this is wrong because the fsync can be done via a file
>>>>>>> descriptor that was opened to a now-deleted link to the file.
>>>>>>
>>>>>> I'm still confused about this...
>>>>>>
>>>>>> I don't know what's wrong with this version from my limited knowledge?
>>>>>> inode itself is locked when fsyncing, so
>>>>>>
>>>>>> if the fsync inode->i_nlink == 1, this inode has only one hard link
>>>>>> (not deleted yet) and should belong to a single directory; and
>>>>>>
>>>>>> the only one parent directory would not go away (not deleted as well)
>>>>>> since there are some dirents in it (not empty).
>>>>>>
>>>>>> Could kindly explain more so I would learn more about this scenario?
>>>>>> Thanks a lot!
>>>>>
>>>>> i_nlink == 1 just means that there is one non-deleted link. There can be links
>>>>> that have since been deleted, and file descriptors can still be open to them.
>>>>
>>>> Thanks for your inspiration. You are right, thanks.
>>>>
>>>> Correct my words... I didn't check f2fs code just now, it seems f2fs doesn't
>>>> take inode_lock as some other fs like __generic_file_fsync or ubifs_fsync.
>>>>
>>>> And i_sem locks nlink / try_to_fix_pino similarly in some extent. It seems
>>>> no race by using d_find_alias here. Thanks again.
>>>>
>>>
>>> (think more little bit just now...)
>>>
>>> Thread 1: Thread 2 (fsync):
>>> vfs_unlink try_to_fix_pino
>>> f2fs_unlink
>>> f2fs_delete_entry
>>> f2fs_drop_nlink (i_sem, inode->i_nlink = 1)
>>>
>>> (... but this dentry still hashed) i_sem, check inode->i_nlink = 1
>>> i_sem d_find_alias
>>>
>>> d_delete
>>>
>>> I'm not sure if fsync could still use some wrong alias by chance..
>>> completely untested, maybe just noise...
Another race condition could be:
Thread 1 (fsync) Thread 2 (rename)
- f2fs_sync_fs
- try_to_fix_pino
- f2fs_rename
- down_write
- file_lost_pino
- up_write
- down_write
- file_got_pino
- up_write
Thanks,
>>>
>>
>> Right, good observation. My patch makes it better, but it's still broken.
>>
>> I don't know how to fix it. If we see i_nlink == 1 and multiple hashed
>> dentries, there doesn't appear to be a way to distingush which one corresponds
>> to the remaining link on-disk (if any; it may not even be in the dcache), and
>> which correspond to links that vfs_unlink() has deleted from disk but hasn't yet
>> done d_delete() on.
>>
>> One idea would be choose one, then take inode_lock_shared(dir) and do
>> __f2fs_find_entry() to check if the dentry is really still on-disk. That's
>> heavyweight and error-prone though, and the locking could cause problems.
>>
>> I'm wondering though, does f2fs really need try_to_fix_pino() at all, and did it
>> ever really work? It never actually updates the f2fs_inode::i_name to match the
>> new directory. So independently of this bug with deleted links, I don't see how
>> it can possibly work as intended.
>
> Part of my humble opinion would be "update pino in rename/unlink/link... such ops
> instead of in fsync" (maybe it makes better sense of locking)... But actually I'm
> not a f2fs folk now, just curious about what the original patch resolved with
> these new extra igrab/iput (as I said before, I could not find some clue previously).
>
> Thanks,
> Gao Xiang
>
>>
>> - Eric
>
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> .
>
Hi Chao,
On Thu, May 07, 2020 at 02:38:39PM +0800, Chao Yu wrote:
> On 2020/5/7 6:36, Gao Xiang wrote:
> > On Wed, May 06, 2020 at 12:16:13PM -0700, Eric Biggers wrote:
> >> On Wed, May 06, 2020 at 02:47:19PM +0800, Gao Xiang wrote:
> >>> On Wed, May 06, 2020 at 09:58:22AM +0800, Gao Xiang wrote:
> >>>> On Tue, May 05, 2020 at 06:24:28PM -0700, Eric Biggers wrote:
> >>>>> On Wed, May 06, 2020 at 08:14:07AM +0800, Gao Xiang wrote:
> >>>>>>>
> >>>>>>> Actually, I think this is wrong because the fsync can be done via a file
> >>>>>>> descriptor that was opened to a now-deleted link to the file.
> >>>>>>
> >>>>>> I'm still confused about this...
> >>>>>>
> >>>>>> I don't know what's wrong with this version from my limited knowledge?
> >>>>>> inode itself is locked when fsyncing, so
> >>>>>>
> >>>>>> if the fsync inode->i_nlink == 1, this inode has only one hard link
> >>>>>> (not deleted yet) and should belong to a single directory; and
> >>>>>>
> >>>>>> the only one parent directory would not go away (not deleted as well)
> >>>>>> since there are some dirents in it (not empty).
> >>>>>>
> >>>>>> Could kindly explain more so I would learn more about this scenario?
> >>>>>> Thanks a lot!
> >>>>>
> >>>>> i_nlink == 1 just means that there is one non-deleted link. There can be links
> >>>>> that have since been deleted, and file descriptors can still be open to them.
> >>>>
> >>>> Thanks for your inspiration. You are right, thanks.
> >>>>
> >>>> Correct my words... I didn't check f2fs code just now, it seems f2fs doesn't
> >>>> take inode_lock as some other fs like __generic_file_fsync or ubifs_fsync.
> >>>>
> >>>> And i_sem locks nlink / try_to_fix_pino similarly in some extent. It seems
> >>>> no race by using d_find_alias here. Thanks again.
> >>>>
> >>>
> >>> (think more little bit just now...)
> >>>
> >>> Thread 1: Thread 2 (fsync):
> >>> vfs_unlink try_to_fix_pino
> >>> f2fs_unlink
> >>> f2fs_delete_entry
> >>> f2fs_drop_nlink (i_sem, inode->i_nlink = 1)
> >>>
> >>> (... but this dentry still hashed) i_sem, check inode->i_nlink = 1
> >>> i_sem d_find_alias
> >>>
> >>> d_delete
> >>>
> >>> I'm not sure if fsync could still use some wrong alias by chance..
> >>> completely untested, maybe just noise...
>
> Another race condition could be:
>
> Thread 1 (fsync) Thread 2 (rename)
> - f2fs_sync_fs
> - try_to_fix_pino
> - f2fs_rename
> - down_write
> - file_lost_pino
> - up_write
> - down_write
> - file_got_pino
> - up_write
Yes, IMHO, I think it could be not proper to
take dir lock in fsync path anyway...
I would suggest as before (if it needs to be fixed).
And it seems no significant performance difference.
Thanks,
Gao Xiang
>
> Thanks,
>
> >>>
> >>
> >> Right, good observation. My patch makes it better, but it's still broken.
> >>
> >> I don't know how to fix it. If we see i_nlink == 1 and multiple hashed
> >> dentries, there doesn't appear to be a way to distingush which one corresponds
> >> to the remaining link on-disk (if any; it may not even be in the dcache), and
> >> which correspond to links that vfs_unlink() has deleted from disk but hasn't yet
> >> done d_delete() on.
> >>
> >> One idea would be choose one, then take inode_lock_shared(dir) and do
> >> __f2fs_find_entry() to check if the dentry is really still on-disk. That's
> >> heavyweight and error-prone though, and the locking could cause problems.
> >>
> >> I'm wondering though, does f2fs really need try_to_fix_pino() at all, and did it
> >> ever really work? It never actually updates the f2fs_inode::i_name to match the
> >> new directory. So independently of this bug with deleted links, I don't see how
> >> it can possibly work as intended.
> >
> > Part of my humble opinion would be "update pino in rename/unlink/link... such ops
> > instead of in fsync" (maybe it makes better sense of locking)... But actually I'm
> > not a f2fs folk now, just curious about what the original patch resolved with
> > these new extra igrab/iput (as I said before, I could not find some clue previously).
> >
> > Thanks,
> > Gao Xiang
> >
> >>
> >> - Eric
> >
> >
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> > .
> >