2017-06-06 20:45:29

by Benjamin Coddington

[permalink] [raw]
Subject: [PATCH 0/3 v4] Fixups for l_pid

LTP fcntl tests (fcntl11 fcntl14 fcntl17 fcntl19 fcntl20 fcntl21) have been
failing for NFSv4 mounts due to an unexpected l_pid. What follows are some
fixups:

v2: - Rebase onto linux-next
- Revert back to using the stack in locks_mandatory_area(), and fixup
patch description for 1/3

v3: - The lkp-robot found some serious per_thread_ops performance
regressions for v1 and v2, so this version changes things around to not
acquire a reference to struct pid in fl_nspid for every lock. Instead,
it drops fl_nspid altogether, and defers the lookup of the
namespace-translated pid until it actually needed.

v4: - Instead of looking up the virtual pid by way of referencing the struct
task of the that pid, instead use find_pid_ns() and pid_nr_ns(), which
avoids a the problem where we race to get a reference to the struct task
while it may be freed.

Benjamin Coddington (3):
fs/locks: Use allocation rather than the stack in fcntl_getlk()
fs/locks: Remove fl_nspid
fs/locks: Use fs-specific l_pid for remote locks

fs/locks.c | 116 ++++++++++++++++++++++++++++++++---------------------
include/linux/fs.h | 2 +-
2 files changed, 72 insertions(+), 46 deletions(-)

--
2.9.3


2017-06-06 20:45:28

by Benjamin Coddington

[permalink] [raw]
Subject: [PATCH 2/3] fs/locks: Remove fl_nspid

Since commit c69899a17ca4 "NFSv4: Update of VFS byte range lock must be
atomic with the stateid update", NFSv4 has been inserting locks in rpciod
worker context. The result is that the file_lock's fl_nspid is the
kworker's pid instead of the original userspace pid.

The fl_nspid is only used to represent the namespaced virtual pid number
when displaying locks or returning from F_GETLK. There's no reason to set
it for every inserted lock, since we can usually just look it up from
fl_pid. So, instead of looking up and holding struct pid for every lock,
let's just look up the virtual pid number from fl_pid when it is needed.
That means we can remove fl_nspid entirely.

Signed-off-by: Benjamin Coddington <[email protected]>
---
fs/locks.c | 52 +++++++++++++++++++++++++++++-----------------------
include/linux/fs.h | 1 -
2 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index d7daa6c8932f..8d48e4c42ed3 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -733,7 +733,6 @@ static void locks_wake_up_blocks(struct file_lock *blocker)
static void
locks_insert_lock_ctx(struct file_lock *fl, struct list_head *before)
{
- fl->fl_nspid = get_pid(task_tgid(current));
list_add_tail(&fl->fl_list, before);
locks_insert_global_locks(fl);
}
@@ -743,10 +742,6 @@ locks_unlink_lock_ctx(struct file_lock *fl)
{
locks_delete_global_locks(fl);
list_del_init(&fl->fl_list);
- if (fl->fl_nspid) {
- put_pid(fl->fl_nspid);
- fl->fl_nspid = NULL;
- }
locks_wake_up_blocks(fl);
}

@@ -823,8 +818,6 @@ posix_test_lock(struct file *filp, struct file_lock *fl)
list_for_each_entry(cfl, &ctx->flc_posix, fl_list) {
if (posix_locks_conflict(fl, cfl)) {
locks_copy_conflock(fl, cfl);
- if (cfl->fl_nspid)
- fl->fl_pid = pid_vnr(cfl->fl_nspid);
goto out;
}
}
@@ -2048,6 +2041,25 @@ int vfs_test_lock(struct file *filp, struct file_lock *fl)
}
EXPORT_SYMBOL_GPL(vfs_test_lock);

+/**
+ * locks_translate_pid - translate a pid number into a namespace
+ * @nr: The pid number in the init_pid_ns
+ * @ns: The namespace into which the pid should be translated
+ *
+ * Used to tranlate a fl_pid into a namespace virtual pid number
+ */
+static pid_t locks_translate_pid(int init_nr, struct pid_namespace *ns)
+{
+ pid_t vnr;
+ struct pid *pid;
+
+ rcu_read_lock();
+ pid = find_pid_ns(init_nr, &init_pid_ns);
+ vnr = pid_nr_ns(pid, ns);
+ rcu_read_unlock();
+ return vnr;
+}
+
static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
{
flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
@@ -2584,22 +2596,16 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl,
{
struct inode *inode = NULL;
unsigned int fl_pid;
+ struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;

- if (fl->fl_nspid) {
- struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
-
- /* Don't let fl_pid change based on who is reading the file */
- fl_pid = pid_nr_ns(fl->fl_nspid, proc_pidns);
-
- /*
- * If there isn't a fl_pid don't display who is waiting on
- * the lock if we are called from locks_show, or if we are
- * called from __show_fd_info - skip lock entirely
- */
- if (fl_pid == 0)
- return;
- } else
- fl_pid = fl->fl_pid;
+ fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
+ /*
+ * If there isn't a fl_pid don't display who is waiting on
+ * the lock if we are called from locks_show, or if we are
+ * called from __show_fd_info - skip lock entirely
+ */
+ if (fl_pid == 0)
+ return;

if (fl->fl_file != NULL)
inode = locks_inode(fl->fl_file);
@@ -2674,7 +2680,7 @@ static int locks_show(struct seq_file *f, void *v)

fl = hlist_entry(v, struct file_lock, fl_link);

- if (fl->fl_nspid && !pid_nr_ns(fl->fl_nspid, proc_pidns))
+ if (locks_translate_pid(fl->fl_pid, proc_pidns) == 0)
return 0;

lock_get_status(f, fl, iter->li_pos, "");
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aa4affb38c39..b013fac515f7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -984,7 +984,6 @@ struct file_lock {
unsigned char fl_type;
unsigned int fl_pid;
int fl_link_cpu; /* what cpu's list is this on? */
- struct pid *fl_nspid;
wait_queue_head_t fl_wait;
struct file *fl_file;
loff_t fl_start;
--
2.9.3

2017-06-06 20:45:27

by Benjamin Coddington

[permalink] [raw]
Subject: [PATCH 1/3] fs/locks: Use allocation rather than the stack in fcntl_getlk()

Struct file_lock is fairly large, so let's save some space on the stack by
using an allocation for struct file_lock in fcntl_getlk(), just as we do
for fcntl_setlk().

Signed-off-by: Benjamin Coddington <[email protected]>
---
fs/locks.c | 46 ++++++++++++++++++++++++++--------------------
1 file changed, 26 insertions(+), 20 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index afefeb4ad6de..d7daa6c8932f 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2086,14 +2086,17 @@ static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
*/
int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
{
- struct file_lock file_lock;
+ struct file_lock *fl;
int error;

+ fl = locks_alloc_lock();
+ if (fl == NULL)
+ return -ENOMEM;
error = -EINVAL;
if (flock->l_type != F_RDLCK && flock->l_type != F_WRLCK)
goto out;

- error = flock_to_posix_lock(filp, &file_lock, flock);
+ error = flock_to_posix_lock(filp, fl, flock);
if (error)
goto out;

@@ -2103,23 +2106,22 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
goto out;

cmd = F_GETLK;
- file_lock.fl_flags |= FL_OFDLCK;
- file_lock.fl_owner = filp;
+ fl->fl_flags |= FL_OFDLCK;
+ fl->fl_owner = filp;
}

- error = vfs_test_lock(filp, &file_lock);
+ error = vfs_test_lock(filp, fl);
if (error)
goto out;

- flock->l_type = file_lock.fl_type;
- if (file_lock.fl_type != F_UNLCK) {
- error = posix_lock_to_flock(flock, &file_lock);
+ flock->l_type = fl->fl_type;
+ if (fl->fl_type != F_UNLCK) {
+ error = posix_lock_to_flock(flock, fl);
if (error)
- goto rel_priv;
+ goto out;
}
-rel_priv:
- locks_release_private(&file_lock);
out:
+ locks_free_lock(fl);
return error;
}

@@ -2298,14 +2300,18 @@ int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd,
*/
int fcntl_getlk64(struct file *filp, unsigned int cmd, struct flock64 *flock)
{
- struct file_lock file_lock;
+ struct file_lock *fl;
int error;

+ fl = locks_alloc_lock();
+ if (fl == NULL)
+ return -ENOMEM;
+
error = -EINVAL;
if (flock->l_type != F_RDLCK && flock->l_type != F_WRLCK)
goto out;

- error = flock64_to_posix_lock(filp, &file_lock, flock);
+ error = flock64_to_posix_lock(filp, fl, flock);
if (error)
goto out;

@@ -2315,20 +2321,20 @@ int fcntl_getlk64(struct file *filp, unsigned int cmd, struct flock64 *flock)
goto out;

cmd = F_GETLK64;
- file_lock.fl_flags |= FL_OFDLCK;
- file_lock.fl_owner = filp;
+ fl->fl_flags |= FL_OFDLCK;
+ fl->fl_owner = filp;
}

- error = vfs_test_lock(filp, &file_lock);
+ error = vfs_test_lock(filp, fl);
if (error)
goto out;

- flock->l_type = file_lock.fl_type;
- if (file_lock.fl_type != F_UNLCK)
- posix_lock_to_flock64(flock, &file_lock);
+ flock->l_type = fl->fl_type;
+ if (fl->fl_type != F_UNLCK)
+ posix_lock_to_flock64(flock, fl);

- locks_release_private(&file_lock);
out:
+ locks_free_lock(fl);
return error;
}

--
2.9.3

2017-06-06 20:45:26

by Benjamin Coddington

[permalink] [raw]
Subject: [PATCH 3/3] fs/locks: Use fs-specific l_pid for remote locks

Now that we're translating fl_pid for F_GETLK and /proc/locks, we need to
handle the case where a remote filesystem directly sets fl_pid. In that
case, the fl_pid should not be translated into a local pid namespace. If
the filesystem implements the lock operation, set a flag to return the
lock's fl_pid value directly, rather translate it.

Signed-off-by: Benjamin Coddington <[email protected]>
---
fs/locks.c | 22 ++++++++++++++++++----
include/linux/fs.h | 1 +
2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 8d48e4c42ed3..206a46d28bbd 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2034,8 +2034,10 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
*/
int vfs_test_lock(struct file *filp, struct file_lock *fl)
{
- if (filp->f_op->lock && is_remote_lock(filp))
+ if (filp->f_op->lock && is_remote_lock(filp)) {
+ fl->fl_flags |= FL_PID_PRIV;
return filp->f_op->lock(filp, F_GETLK, fl);
+ }
posix_test_lock(filp, fl);
return 0;
}
@@ -2060,9 +2062,18 @@ static pid_t locks_translate_pid(int init_nr, struct pid_namespace *ns)
return vnr;
}

+static pid_t flock_translate_pid(struct file_lock *fl)
+{
+ if (IS_OFDLCK(fl))
+ return -1;
+ if (fl->fl_flags & FL_PID_PRIV)
+ return fl->fl_pid;
+ return locks_translate_pid(fl->fl_pid, task_active_pid_ns(current));
+}
+
static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
{
- flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
+ flock->l_pid = flock_translate_pid(fl);
#if BITS_PER_LONG == 32
/*
* Make sure we can represent the posix lock via
@@ -2084,7 +2095,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
#if BITS_PER_LONG == 32
static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
{
- flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
+ flock->l_pid = flock_translate_pid(fl);
flock->l_start = fl->fl_start;
flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
fl->fl_end - fl->fl_start + 1;
@@ -2598,7 +2609,10 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl,
unsigned int fl_pid;
struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;

- fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
+ if (fl->fl_flags & FL_PID_PRIV)
+ fl_pid = fl->fl_pid;
+ else
+ fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
/*
* If there isn't a fl_pid don't display who is waiting on
* the lock if we are called from locks_show, or if we are
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b013fac515f7..179496a9719d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -908,6 +908,7 @@ static inline struct file *get_file(struct file *f)
#define FL_UNLOCK_PENDING 512 /* Lease is being broken */
#define FL_OFDLCK 1024 /* lock is "owned" by struct file */
#define FL_LAYOUT 2048 /* outstanding pNFS layout */
+#define FL_PID_PRIV 4096 /* F_GETLK should report fl_pid */

#define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)

--
2.9.3

2017-06-07 11:40:14

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH 3/3] fs/locks: Use fs-specific l_pid for remote locks

On Tue, 2017-06-06 at 16:45 -0400, Benjamin Coddington wrote:
> Now that we're translating fl_pid for F_GETLK and /proc/locks, we need to
> handle the case where a remote filesystem directly sets fl_pid. In that
> case, the fl_pid should not be translated into a local pid namespace. If
> the filesystem implements the lock operation, set a flag to return the
> lock's fl_pid value directly, rather translate it.
>

Actually, you're not translating anything for F_GETLK until we get to
this patch. Patch #2 in this series removes the fl_nspid field, but the
pid translation isn't fixed until here. That does mean a nominal
regression here in how fl_pid is reported between the two.

Would it be best to squash #2 and #3 together? Or maybe just go ahead
and universally translate the fl_pid field until you add the flag in
this patch?

Also to make sure I understand: task->tgid will always represent the
task's pid in the init_pid_ns, right?

Other than the minor bisectability concern, I think this looks good.
Nice work!

> Signed-off-by: Benjamin Coddington <[email protected]>
> ---
> fs/locks.c | 22 ++++++++++++++++++----
> include/linux/fs.h | 1 +
> 2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 8d48e4c42ed3..206a46d28bbd 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2034,8 +2034,10 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
> */
> int vfs_test_lock(struct file *filp, struct file_lock *fl)
> {
> - if (filp->f_op->lock && is_remote_lock(filp))
> + if (filp->f_op->lock && is_remote_lock(filp)) {
> + fl->fl_flags |= FL_PID_PRIV;
> return filp->f_op->lock(filp, F_GETLK, fl);
> + }
> posix_test_lock(filp, fl);
> return 0;
> }
> @@ -2060,9 +2062,18 @@ static pid_t locks_translate_pid(int init_nr, struct pid_namespace *ns)
> return vnr;
> }
>
> +static pid_t flock_translate_pid(struct file_lock *fl)
> +{
> + if (IS_OFDLCK(fl))
> + return -1;
> + if (fl->fl_flags & FL_PID_PRIV)
> + return fl->fl_pid;
> + return locks_translate_pid(fl->fl_pid, task_active_pid_ns(current));
> +}
> +
> static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
> {
> - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
> + flock->l_pid = flock_translate_pid(fl);
> #if BITS_PER_LONG == 32
> /*
> * Make sure we can represent the posix lock via
> @@ -2084,7 +2095,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
> #if BITS_PER_LONG == 32
> static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
> {
> - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
> + flock->l_pid = flock_translate_pid(fl);
> flock->l_start = fl->fl_start;
> flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
> fl->fl_end - fl->fl_start + 1;
> @@ -2598,7 +2609,10 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl,
> unsigned int fl_pid;
> struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
>
> - fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
> + if (fl->fl_flags & FL_PID_PRIV)
> + fl_pid = fl->fl_pid;
> + else
> + fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
> /*
> * If there isn't a fl_pid don't display who is waiting on
> * the lock if we are called from locks_show, or if we are
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b013fac515f7..179496a9719d 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -908,6 +908,7 @@ static inline struct file *get_file(struct file *f)
> #define FL_UNLOCK_PENDING 512 /* Lease is being broken */
> #define FL_OFDLCK 1024 /* lock is "owned" by struct file */
> #define FL_LAYOUT 2048 /* outstanding pNFS layout */
> +#define FL_PID_PRIV 4096 /* F_GETLK should report fl_pid */
>
> #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)
>

--
Jeff Layton <[email protected]>

2017-06-07 11:50:35

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH 3/3] fs/locks: Use fs-specific l_pid for remote locks

On Tue, 2017-06-06 at 16:45 -0400, Benjamin Coddington wrote:
> Now that we're translating fl_pid for F_GETLK and /proc/locks, we need to
> handle the case where a remote filesystem directly sets fl_pid. In that
> case, the fl_pid should not be translated into a local pid namespace. If
> the filesystem implements the lock operation, set a flag to return the
> lock's fl_pid value directly, rather translate it.
>
> Signed-off-by: Benjamin Coddington <[email protected]>
> ---
> fs/locks.c | 22 ++++++++++++++++++----
> include/linux/fs.h | 1 +
> 2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 8d48e4c42ed3..206a46d28bbd 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2034,8 +2034,10 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
> */
> int vfs_test_lock(struct file *filp, struct file_lock *fl)
> {
> - if (filp->f_op->lock && is_remote_lock(filp))
> + if (filp->f_op->lock && is_remote_lock(filp)) {
> + fl->fl_flags |= FL_PID_PRIV;
> return filp->f_op->lock(filp, F_GETLK, fl);
> + }
> posix_test_lock(filp, fl);
> return 0;
> }

I do have one concern here with setting the flag at this point. It's
possible with NFS that we'll end up setting a lock locally if we have a
delegation (also true for CIFS and probably Ceph -- maybe others too).

Here though, you're setting FL_PID_PRIV so those locks will always show
a l_pid of current->tgid, even when you're querying it from a different
pid namespace.

If we want to go this route, then you'll also need to fix NFS to clear
that flag when it sets the lock locally, and audit other fs' that have a
->lock operation for the same thing.

However, I think it might be cleaner to have the filesystems set that
flag to opt out of the default pid translation.

Thoughts?

> @@ -2060,9 +2062,18 @@ static pid_t locks_translate_pid(int init_nr, struct pid_namespace *ns)
> return vnr;
> }
>
> +static pid_t flock_translate_pid(struct file_lock *fl)
> +{
> + if (IS_OFDLCK(fl))
> + return -1;
> + if (fl->fl_flags & FL_PID_PRIV)
> + return fl->fl_pid;
> + return locks_translate_pid(fl->fl_pid, task_active_pid_ns(current));
> +}
> +
> static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
> {
> - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
> + flock->l_pid = flock_translate_pid(fl);
> #if BITS_PER_LONG == 32
> /*
> * Make sure we can represent the posix lock via
> @@ -2084,7 +2095,7 @@ static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
> #if BITS_PER_LONG == 32
> static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
> {
> - flock->l_pid = IS_OFDLCK(fl) ? -1 : fl->fl_pid;
> + flock->l_pid = flock_translate_pid(fl);
> flock->l_start = fl->fl_start;
> flock->l_len = fl->fl_end == OFFSET_MAX ? 0 :
> fl->fl_end - fl->fl_start + 1;
> @@ -2598,7 +2609,10 @@ static void lock_get_status(struct seq_file *f, struct file_lock *fl,
> unsigned int fl_pid;
> struct pid_namespace *proc_pidns = file_inode(f->file)->i_sb->s_fs_info;
>
> - fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
> + if (fl->fl_flags & FL_PID_PRIV)
> + fl_pid = fl->fl_pid;
> + else
> + fl_pid = locks_translate_pid(fl->fl_pid, proc_pidns);
> /*
> * If there isn't a fl_pid don't display who is waiting on
> * the lock if we are called from locks_show, or if we are
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b013fac515f7..179496a9719d 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -908,6 +908,7 @@ static inline struct file *get_file(struct file *f)
> #define FL_UNLOCK_PENDING 512 /* Lease is being broken */
> #define FL_OFDLCK 1024 /* lock is "owned" by struct file */
> #define FL_LAYOUT 2048 /* outstanding pNFS layout */
> +#define FL_PID_PRIV 4096 /* F_GETLK should report fl_pid */
>
> #define FL_CLOSE_POSIX (FL_POSIX | FL_CLOSE)
>

--
Jeff Layton <[email protected]>

2017-06-19 12:38:02

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH 3/3] fs/locks: Use fs-specific l_pid for remote locks

Apologies for the delayed response..

On 7 Jun 2017, at 7:40, Jeff Layton wrote:

> On Tue, 2017-06-06 at 16:45 -0400, Benjamin Coddington wrote:
>> Now that we're translating fl_pid for F_GETLK and /proc/locks, we need to
>> handle the case where a remote filesystem directly sets fl_pid. In that
>> case, the fl_pid should not be translated into a local pid namespace. If
>> the filesystem implements the lock operation, set a flag to return the
>> lock's fl_pid value directly, rather translate it.
>>
>
> Actually, you're not translating anything for F_GETLK until we get to
> this patch. Patch #2 in this series removes the fl_nspid field, but the
> pid translation isn't fixed until here. That does mean a nominal
> regression here in how fl_pid is reported between the two.

Good catch.

> Would it be best to squash #2 and #3 together? Or maybe just go ahead
> and universally translate the fl_pid field until you add the flag in
> this patch?

I'll send a v4 that universally translates the fl_pid field until this
patch. I think the first two patches should be separate.

> Also to make sure I understand: task->tgid will always represent the
> task's pid in the init_pid_ns, right?

Yes. (I was incorrect on IRC last week), as is seen in kernel/fork.c:
1748 if (pid != &init_struct_pid) {
1749 pid = alloc_pid(p->nsproxy->pid_ns_for_children);
...
1754 }
...
1785 p->pid = pid_nr(pid);
1786 if (clone_flags & CLONE_THREAD) {
...
1789 p->tgid = current->tgid;
1790 } else {
...
1796 p->tgid = p->pid;
1797 }

.. and include/linux/pid.h:
153 /*
154 * the helpers to get the pid's id seen from different namespaces
...
156 * pid_nr() : global id, i.e. the id seen from the init namespace;
...
162 */
163
164 static inline pid_t pid_nr(struct pid *pid)
165 {
166 pid_t nr = 0;
167 if (pid)
168 nr = pid->numbers[0].nr;
169 return nr;
170 }

Ben

2017-06-19 12:53:09

by Benjamin Coddington

[permalink] [raw]
Subject: Re: [PATCH 3/3] fs/locks: Use fs-specific l_pid for remote locks

On 19 Jun 2017, at 8:37, Benjamin Coddington wrote:

> Apologies for the delayed response..
>
> On 7 Jun 2017, at 7:40, Jeff Layton wrote:
>
>> On Tue, 2017-06-06 at 16:45 -0400, Benjamin Coddington wrote:
>>> Now that we're translating fl_pid for F_GETLK and /proc/locks, we need to
>>> handle the case where a remote filesystem directly sets fl_pid. In that
>>> case, the fl_pid should not be translated into a local pid namespace. If
>>> the filesystem implements the lock operation, set a flag to return the
>>> lock's fl_pid value directly, rather translate it.
>>>
>>
>> Actually, you're not translating anything for F_GETLK until we get to
>> this patch. Patch #2 in this series removes the fl_nspid field, but the
>> pid translation isn't fixed until here. That does mean a nominal
>> regression here in how fl_pid is reported between the two.
>
> Good catch.
>
>> Would it be best to squash #2 and #3 together? Or maybe just go ahead
>> and universally translate the fl_pid field until you add the flag in
>> this patch?
>
> I'll send a v4 that universally translates the fl_pid field until this
> patch. I think the first two patches should be separate.

Ah, but /2 and 3/ should just be squashed, yes I agree with that.

Ben